Minimum Description Length Model Selection for Multinomial Processing Tree Models
Hao Wu (Ohio State University), Jay I. Myung (Ohio State University), William H. Batchelder (University of California, Irvine)

December 25, 2008

Word count of text: 4000
Running head: Model Selection

Corresponding author and address: Hao Wu, Department of Psychology, Ohio State University, 1835 Neil Avenue, Columbus, Ohio. Tel:
Abstract

Multinomial processing tree (MPT) modeling, since its introduction in the 1980s, has been widely and successfully applied as a statistical tool for measuring hypothesized latent cognitive processes in selected experimental paradigms. This paper concerns the problem of selecting among a set of scientifically plausible MPT models given observed data. The likelihood ratio test (LRT) is often employed in model selection for MPT models, but it comes with a number of assumptions and constraints that can limit its applicability in practice. In this paper, we introduce a minimum description length (MDL) based model selection approach for MPT modeling that overcomes some of the shortcomings of the LRT. The application of the MDL approach is demonstrated for a well-studied family of MPT models for the source-monitoring paradigm. The results indicate that applying MDL in model selection can lead to scientific conclusions about underlying cognitive functioning that differ from those based on the LRT.

1 Introduction

Multinomial processing tree (MPT) modeling is a statistical methodology for measuring latent cognitive capacities in selected experimental paradigms (Batchelder & Riefer, 1990, 1999). In such experiments, participants make categorical responses to a series of test items, and an MPT model parameterizes a subset of probability distributions over the response categories by specifying a processing tree designed to represent hypothesized cognitive steps in performing a cognitive task, such as memory encoding, storage, discrimination, inference, guessing, and retrieval. Since their introduction in the 1980s, MPT models have been successfully applied to modeling performance in a range of cognitive tasks in many areas, including memory, perception, and social psychology. This paper is concerned with the problem of selecting the best MPT model from a set of scientifically plausible MPT models that are available to account for a given set of data.
A researcher may entertain multiple scientific hypotheses about the underlying processes, each formulated as a distinct MPT model, and wish to determine which of the models best describes observed data in some defined sense. This is the model selection problem (Myung & Pitt, 1997). The G²-based likelihood ratio test (LRT) is often employed in model evaluation and selection for MPT models (e.g., Hu & Batchelder, 1994). The basis of this approach is to evaluate the adequacy of a model in the null hypothesis significance testing framework. Although the LRT is generally useful for model evaluation in this context, it has several limitations in its application to model selection with multiple models. First, it can only be used to test a pair of nested models with different numbers of parameters. For this reason, it cannot compare models with the same number of parameters but different functional forms, nor can it test inequality constraints on the parameters of a model. Furthermore, the LRT is a null hypothesis significance testing procedure whose goal is to assess the descriptive adequacy of a given model, as measured by the p-value. Consequently, the LRT does not necessarily help identify the best approximating model to the truth, which is the hallmark of model selection. This criterion is known as generalizability or predictive accuracy in statistics and refers to a model's ability to predict future, as yet unseen, data samples (Pitt, Myung & Zhang, 2002). Various model selection methods that measure a model's generalizability have been proposed. They include the Akaike Information Criterion (AIC: Akaike, 1973), the Bayesian Information Criterion (BIC: Schwartz, 1978), and Minimum Description Length (MDL: Grünwald, Myung & Pitt, 2005; Grünwald, 2007).[1] Among these, MDL presents itself as an attractive alternative to the LRT in MPT modeling because it overcomes the aforementioned limitations of the LRT. That is, MDL can be used for selecting among a set of multiple, nested or non-nested, models that might differ in model structure as well as in number of parameters. In this paper, we assess and demonstrate the applicability of the MDL approach for selecting among MPT models using a well-studied family of MPT models for the source-monitoring paradigm.

2 Minimum Description Length

The MDL principle originated from algorithmic coding theory in computer science, and it views statistical modeling as data compression (Grünwald, 2007). The best model is the one that compresses the data as tightly as possible. A model's ability to compress data is measured by the shortest code length with which the data can be coded with the help of the model. The resulting code length is related to generalizability such that the shorter the code length, the better the model generalizes. The most widely used implementation of the MDL principle for model selection is the Fisher Information Approximation (FIA: Rissanen, 1996). For a model with probability density function f(y | θ) defined over parameter space Θ, it is defined as

  FIA = -LML + C_FIA,    (1)

where

  LML = ln f(y | θ̂(y)),
  C_FIA = (S/2) ln(N / 2π) + ln ∫_Θ sqrt(det I(θ)) dθ.    (2)

LML is the natural logarithm of the maximized likelihood function, so that -LML serves as a goodness-of-fit term, and C_FIA is the term that measures the complexity of the model.[2]
In the equations, S is the number of parameters, N is the sample size, I(θ) is the Fisher information matrix for sample size 1, defined as

  I(θ)_ij = -E[∂² ln f(x1 | θ) / ∂θi ∂θj],  i, j = 1, ..., S,

and |I(θ)| is its determinant. A smaller criterion value indicates better generalizability, and thus the model that minimizes FIA should be chosen. The FIA complexity measure, C_FIA, takes into account not only the number of parameters (S) and the sample size (N) but also, importantly, functional form, captured through the Fisher information matrix. This is unlike AIC or BIC, which consider only the number of parameters as a measure of model complexity. The two criteria are defined as AIC = -LML + S and BIC = -LML + (S/2) ln(N). FIA has been successfully applied to addressing various model selection issues in cognitive modeling (e.g., Lee, 2001; Pitt, Myung & Zhang, 2002; Myung, Pitt & Navarro, 2007).

[1] The reader is directed to two recent Journal of Mathematical Psychology special issues on model selection for discussion and example applications of these and other methods of model selection (Myung, Forster & Browne, 2000; Wagenmakers & Waldorp, 2006).
[2] C_FIA is derived as a large-sample approximation to a full, but much harder to compute, MDL complexity measure (Grünwald, 2007), and as such, it can be inaccurate (see, e.g., Navarro, 2004). The reader should note that the G²-based LRT is also based on a large-sample approximation.

3 Model Selection for MPT Source Monitoring Models

In the remainder of the paper we illustrate the application of FIA to select among memory models of source monitoring (Batchelder & Riefer, 1990; Bayen, Murnane & Erdfelder, 1996). In a source-monitoring study, participants are exposed to items from two sources (A and B), and then they are required to categorize test items as from source A, source B, or new (N). The data in any condition can be represented in a 3 × 3 table, where the rows are the test item types (A, B, N) and the columns are the response categories corresponding to these types. All of the models discussed in this section are special cases of the model in Figure 1, known as the two-high-threshold source monitoring model (2HTM). The model in Figure 1 has three trees corresponding to the three test item types. It has parameters for detecting the item type as old or new (the D parameters), parameters for discriminating the source of a detected old item (the d parameters), and guessing parameters (a, g, and b). In the case of MPT models, differences in functional form arise from different tree structures or different assignments of parameters or categories to the nodes of the tree.
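The three trees just described translate directly into category response probabilities. The following sketch writes out the standard 2HTM equations and the two four-parameter special cases discussed later in this section; it is an illustration based on the textual description of Figure 1, not a verbatim transcription of that figure.

```python
def two_htm_probs(D1, D2, D3, d1, d2, a, b, g):
    """Response probabilities (A, B, N) for each item type under the 2HTM."""
    source_a = (
        D1 * d1 + D1 * (1 - d1) * a + (1 - D1) * b * g,    # respond "A"
        D1 * (1 - d1) * (1 - a) + (1 - D1) * b * (1 - g),  # respond "B"
        (1 - D1) * (1 - b),                                # respond "new"
    )
    source_b = (
        D2 * (1 - d2) * a + (1 - D2) * b * g,
        D2 * d2 + D2 * (1 - d2) * (1 - a) + (1 - D2) * b * (1 - g),
        (1 - D2) * (1 - b),
    )
    new_items = (
        (1 - D3) * b * g,
        (1 - D3) * b * (1 - g),
        D3 + (1 - D3) * (1 - b),
    )
    return source_a, source_b, new_items

def one_htm_4(D, d, b, g):
    # 1HTM-4: D1 = D2 = D, d1 = d2 = d, a = g, no threshold for new items.
    return two_htm_probs(D, D, 0.0, d, d, g, b, g)

def two_ehtm_4(D, d, b, g):
    # 2eHTM-4: same, but the new-item threshold equals the old-item threshold.
    return two_htm_probs(D, D, D, d, d, g, b, g)
```

Each tree's three probabilities sum to one for any admissible parameter values, which is a quick sanity check on the tree structure.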
Batchelder & Riefer (1990) provided a general one-high-threshold source monitoring model (1HTM), which is a special case of the 2HTM obtained by requiring D3 = 0. They further defined seven MPT models (Batchelder & Riefer, 1990, Figure 2) obtained by equating the detection parameters (D1 = D2), the discrimination parameters (d1 = d2), and/or the guessing parameters (a = g). Figure 2 shows the C_FIA curves[3] for the six of the eight models that are identified.[4] The first thing to note in the figure is that complexity is generally ordered according to the number of parameters. Also note that among models with the same number of parameters, complexity values can differ substantially from one another, sometimes by more than the complexity difference due to a difference in the number of parameters. A case in point is the three models 4, 5b, and 5c. At the 50% value of new items, the complexity difference between model 5c and model 5b is about 1.50, which is greater than the complexity difference (0.62) between models 5b and 4. In short, the results demonstrate that model complexity is determined not only by the number of parameters but also, importantly, by functional form (i.e., tree structure, parameter assignment, and category assignment), and sometimes even more so by the latter.

[3] The C_FIA values in this paper were calculated with a MATLAB program for arbitrary MPT models. Please contact the first author for more details.
[4] An MPT model is identified if different parameter values generate different category probabilities.

Bayen, Murnane & Erdfelder (1996, Appendix C) published source monitoring data from a 3 (distractor similarity) × 3 (source similarity) factorial experiment designed to select among different versions of the 2HTM. The goal was to experimentally manipulate conditions designed to affect the different processes represented by parameters in the model. In each condition of their study, participants listened to statements from two narrative sources (A and B) about a hypothetical missing person. Then they were asked to categorize test words as from narrator A, narrator B, or new (N). The data collected were summarized into nine 3 × 3 tables, one for each of the nine experimental conditions (Bayen, Murnane & Erdfelder, 1996, Appendix C). The purpose of their study was to evaluate the appropriateness of three different MPT models in modeling source monitoring data, namely the one-high-threshold four-parameter model (1HTM-4; model 4 in Batchelder & Riefer, 1990, Figure 2), the two-equal-high-threshold four-parameter model (2eHTM-4; model 4 in Bayen, Murnane & Erdfelder, 1996, Figure 4), and the low-threshold model. We will focus only on the 1HTM-4 and 2eHTM-4 models. Both models have four parameters (D, d, b, g) and are special cases of the 2HTM assuming D1 = D2 (= D), d1 = d2 (= d), and a = g. The difference is that in the 1HTM-4 there is no threshold for new items (D3 = 0), while in the 2eHTM-4 the threshold for new items is assumed equal to that of the old items (D3 = D1 = D2). Based on signal detection considerations, six experimental hypotheses, shown in Table 1, were constructed to test the validity of each model. For example, increasing distractor similarity should decrease old-item detection (H1), increasing source similarity should decrease source discrimination (H4), and so on. To test the six hypotheses for each model, Bayen and colleagues relied on multiple pairwise LRTs.
Fixing one factor at a given level, an LRT of the null hypothesis of parameter equality across the three levels of the other factor was performed. In this way, a total of 18 LRTs were performed for each model. For the 2eHTM-4, all but one of the tests supported the six hypotheses; for the 1HTM-4, however, H2 was not supported. The authors therefore concluded that the 2eHTM was the more valid of the two models. The results from multiple LRTs, however seemingly clear-cut, should be taken with a grain of salt. In addition to the conceptual and technical weaknesses of the LRT discussed in Section 1, the following concerns should be noted. First, the results from the 18 tests, if taken as a single hypothesis, may not hold true, because the overall Type I error rate was not protected. Furthermore, the parametric constraints of the 2eHTM-4 may not be realistic. In particular, the equality constraint on the three detection parameters, D1 = D2 = D3, seems unnecessarily stringent. Rather, the assumption that the detection parameter D3 for new items differs from D1 and D2 for old items seems more reasonable. In fact, the equality constraint was imposed not for theoretical reasons but primarily for a technical one, namely, to make the model identifiable (Bayen, Murnane & Erdfelder, 1996, p. 206). We now analyze the same data set using FIA after making two modifications to the original MPT model of Bayen, Murnane & Erdfelder (1996) and to the way in which the data were used to evaluate the model. First, our analysis is based on the 2HTM with five parameters (D, D', d, b, g), where D (= D1 = D2) is the detection parameter for both source A and source B items and D' (= D3) is the detection parameter for new items. Since this model, 2HTM-5, is not identified in each separate condition of the 3 × 3 design, we devised a new model testing scheme that addresses this issue, which leads to the second modification: instead of fitting the same model repeatedly to the data from each condition, as done in Bayen, Murnane & Erdfelder (1996), we develop one "super" 2HTM-5 with relevant constraints on the parameters across treatments and evaluate this model just once against the entire data set. In particular, we use the equality constraints in H2 (for both D and D') and H3 for this purpose. With this novel specification, the model is identified, and therefore the questionable constraint D = D' within each treatment can be removed. It is worth mentioning that we could also construct many such super models, nested or non-nested, each corresponding to a different combination of H1 and H4-H6 described earlier. With these changes implemented, we applied FIA to select among MPT models, and the results are compared to those based on AIC and BIC. A total of 15 models with various combinations of equality and inequality constraints on the parameters were constructed and compared simultaneously with one another using FIA. The results are presented in Tables 2 and 3. Note that the non-identified 2HTM-5 assumes five parameters for each of the nine conditions of the experimental design; however, each of the equality constraints in H2 (for both D and D') and H3 removes six parameters, resulting in an identified model with (5 × 9) - (6 × 3) = 27 parameters for the entire data set. The resulting model is denoted M0-0 in Table 2. This least constrained model, which nests all other models in the tables, serves as a baseline for the analysis described next. From model M0-0, the other 14 models were derived by incorporating additional equality constraints into the model.
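The parameter bookkeeping behind counts such as (5 × 9) - (6 × 3) = 27 can be sketched as follows. The constraint codes anticipate those used in Tables 2 and 3; the model dictionary is illustrative, with "Dprime" standing in for D'.

```python
# Free-parameter counts per constraint code in the 3 x 3 design:
#   "v"  = free in all 9 cells, "s" = one shared value,
#   "s1" = same across source-similarity levels (3 free values),
#   "s2" = same across distractor-similarity levels (3 free values),
#   "0"  = fixed to 0, "D" = tied to D in the same cell (no free values).
FREE = {"v": 9, "s": 1, "s1": 3, "s2": 3, "0": 0, "D": 0}

def n_params(model):
    """model: dict mapping parameter name -> constraint code."""
    return sum(FREE[code] for code in model.values())

# Baseline super model M0-0: H2 holds for D and D', H3 holds for d,
# while b and g are left free to vary over all nine conditions.
m0_0 = {"D": "s1", "Dprime": "s1", "b": "v", "d": "s2", "g": "v"}
```

Here n_params(m0_0) returns 3 + 3 + 9 + 3 + 9 = 27, matching the count in the text.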
They are grouped into three subgroups: (a) the 2HTMs in group M0 assume separate detection thresholds for old and new items; (b) the 1HTMs in group M1 assume no detection threshold for new items (i.e., D' = 0); and (c) the 2eHTMs in group M2 assume equal thresholds for old and new items (i.e., D' = D). Within each subgroup, multiple models are constructed by imposing additional equality constraints from H1 and H4-H6 to address relevant theoretical issues. A total of 14 such models are presented for evaluation.[5] Of these, models M1-3 and M2-3 are noteworthy, as they are fully compatible with H1-H6 as stated in Bayen, Murnane & Erdfelder (1996). Results from the FIA analyses for the 15 models, as well as the AICs and BICs, are summarized in Tables 2 and 3. It should be noted that hypotheses H1 (for both D and D'), H4, and H5 involve inequality constraints on the parameters when the corresponding equality constraints are not assumed. Both versions of each of the 15 models, with and without those inequality constraints, are included in the analysis.[6] Because the inequality constraints are not violated in this data set for any of the 15 models, both versions give the same LML, AIC, and BIC results, and the only difference made by the inclusion of the inequality constraints is in the FIA results. The FIA result for the version of a model without inequality constraints is listed in the rows marked FIA, while that for the version with inequality constraints is listed in the rows marked FIAi.

[5] Models whose fit is very poor (with -LML > 239) and models that lead to a very small improvement in fit relative to a more constrained model (with a change in -LML < 1) are excluded because they will not be selected by any of the model selection criteria. Model M0-4 is not identified; S = 16 denotes its effective number of parameters. The identification condition max_j d_j = 1 was employed to obtain LML and C_FIA.
[6] This applies to b only when H6 holds.

Among the four 1HTMs, M1-x (x = 1, ..., 4), FIA selects model M1-3 with inequality constraints within this class. Regarding AIC and BIC, the former favors M1-3 whereas the latter favors M1-4. For the five 2eHTMs, all three selection criteria choose model M2-3 within this class. Between this model with and without inequality constraints, FIA favors the former. In short, M1-3 and M2-3 with inequality constraints on the parameters turn out to be favored within their respective classes. This conclusion only partially agrees with that of Bayen, Murnane & Erdfelder (1996) based on multiple LRTs, which pointed to the 2eHTM M2-3 but not the 1HTM M1-3. In addition, because both M1-3 and M2-3 are consistent with the six hypotheses, our analysis does not support their conclusion that the 2eHTM describes the source monitoring data better than the 1HTM. When we compare all 15 models, including the five in group M0, all three criteria unanimously select the 2HTM M0-4. In particular, between model M0-4 with and without inequality constraints, FIA favors the one with the constraints. The final choice of this model as the best generalizing model of all considered has two important implications for interpreting the Bayen, Murnane & Erdfelder (1996) data. First, the model does not make the assumption D = D', casting doubt on the equality assumption made in Bayen, Murnane & Erdfelder (1996). Second, note that under model M0-4, both D and b are set to be the same for all treatments, but D' and d are allowed to vary with distractor and source similarity levels, respectively, in the desired direction. This implies that besides H2 and H3, which are baseline, the data do not support H5, partially support H1, and fully support H4 and H6.
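The selection rule applied throughout is simply: compute each criterion for every candidate model and take the minimum. A minimal sketch, using the criterion definitions from Section 2 and made-up fit and complexity values (illustrative only, not the values from this study):

```python
import math

N = 11232  # total sample size of the Bayen et al. (1996) data set

# Hypothetical (illustrative) -ln-likelihood, parameter count, and C_FIA:
candidates = {
    "M0-4": {"neg_lml": 205.0, "s": 16, "c_fia": 40.0},
    "M1-3": {"neg_lml": 215.0, "s": 17, "c_fia": 44.0},
    "M2-3": {"neg_lml": 214.0, "s": 17, "c_fia": 45.0},
}

def criteria(m):
    """FIA, AIC, BIC on the scale used in this paper (smaller is better)."""
    return {
        "FIA": m["neg_lml"] + m["c_fia"],
        "AIC": m["neg_lml"] + m["s"],
        "BIC": m["neg_lml"] + m["s"] / 2 * math.log(N),
    }

def best(by):
    """Name of the candidate minimizing the chosen criterion."""
    return min(candidates, key=lambda name: criteria(candidates[name])[by])
```

With these made-up numbers, best("FIA"), best("AIC"), and best("BIC") all return "M0-4", mirroring the unanimous choice reported above; with other numbers the three criteria can, of course, disagree.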
To summarize, the reanalysis of the Bayen, Murnane & Erdfelder (1996) data using FIA has led to interpretations of the data that are substantively different from those based on their LRTs. According to our conclusion, under the experimental manipulations both the 1HTM and the 2eHTM exhibit the pattern of parameter change specified in the six hypotheses, and there is no evidence favoring the 2eHTM over the 1HTM. Furthermore, neither D' = D nor D' = 0 is supported by our model selection approach. The chosen model has different thresholds for the old and new items. In this model, both guessing parameters are insensitive to the experimental manipulations,[7] increasing distractor similarity decreases the threshold of new items but not of old items, and increasing source similarity decreases source discriminability.

[7] If models assuming equal g over the nine treatments are included in the comparison, FIA will choose the version of M0-4 with constant g and inequality constraints, with FIA = 201.5, lower than the second lowest FIA value by ΔFIA =

4 Conclusions

MDL's flexibility of application to virtually any model comparison situation that may arise in MPT modeling makes it an attractive alternative to the traditional LRT approach. It is important to note that MDL represents a model selection approach in which the models in contention are ranked by their generalizability or, equivalently, predictive accuracy, the hallmark of model selection. In the current application, hypotheses designed to test the validity of a model were specified as master models using various combinations of equality and inequality restrictions on the parameters. In this way, several potentially valid models could be compared on FIA rather than through a series of LRTs based on a null hypothesis testing approach. This approach may be especially useful for choosing a valid model to use in investigating a number of future data sets of interest. In the present study we focused our investigations on FIA because of its unique advantage in describing model complexity. Other methods of MPT model selection not discussed in the present paper include the Bayes factor (BF: Kass & Raftery, 1995) and the Population-Parameter Mapping (PPM) method (Chechile, 1998). Both methods originate from Bayesian statistics, and the PPM method in particular is unique in that it can handle misspecified models. Unlike FIA, however, neither of these methods lends itself to furnishing a stand-alone complexity measure, which is why we did not specifically include them in the present investigation. In conclusion, model selection lies at the core of the scientific inference process. Accordingly, a theoretically well-justified and widely applicable methodology can help advance science. We believe that MDL represents such a methodology, providing versatile yet powerful tools for assessing the validity of MPT models in a way that goes beyond the shortcomings of current methods such as AIC, BIC, and the LRT. One final note: statistical model selection techniques alone, however sophisticated, are not a panacea for all inference problems. Other, non-statistical means of model evaluation, such as plausibility, interpretability, and explanatory adequacy, are equally if not more important.
Statistical model selection would be most useful if combined with the judicious use of sound, subjective but scientific, judgment.
References

Akaike, H. (1973). Information theory and an extension of the maximum likelihood principle. In Kotz and Johnson (1992), Breakthroughs in Statistics. NY: Springer Verlag.
Batchelder, W. H., & Riefer, D. M. (1990). Multinomial processing models of source monitoring. Psychological Review, 97.
Batchelder, W. H., & Riefer, D. M. (1999). Theoretical and empirical review of multinomial process tree modeling. Psychonomic Bulletin and Review, 6.
Bayen, U. J., Murnane, K., & Erdfelder, E. (1996). Source discrimination, item detection, and multinomial models for source monitoring. Journal of Experimental Psychology: Learning, Memory, and Cognition, 22.
Chechile, R. A. (1998). A new method for estimating model parameters for multinomial data. Journal of Mathematical Psychology, 42.
Grünwald, P. (2007). The Minimum Description Length Principle. MIT Press.
Grünwald, P., Myung, I. J., & Pitt, M. A. (2005). Advances in Minimum Description Length: Theory and Applications. MIT Press.
Hu, X., & Batchelder, W. H. (1994). The statistical analysis of general processing tree models with the EM algorithm. Psychometrika, 59.
Kass, R. E., & Raftery, A. E. (1995). Bayes factors. Journal of the American Statistical Association, 90.
Knapp, B., & Batchelder, W. H. (2004). Representing parametric order constraints in multi-trial applications of multinomial processing tree models. Journal of Mathematical Psychology.
Lawley, D. N., & Maxwell, A. E. (1971). Factor analysis as a statistical method. Elsevier, New York.
Lee, M. D. (2001). On the complexity of additive clustering models. Journal of Mathematical Psychology, 45.
Myung, I. J., Forster, M. R., & Browne, M. W. (2000). Guest editors' introduction: special issue on model selection. Journal of Mathematical Psychology, 44, 1-2.
Myung, I. J., & Pitt, M. A. (1997). Applying Occam's razor in modeling cognition: A Bayesian approach. Psychonomic Bulletin & Review, 4(1).
Myung, I. J., Pitt, M. A., & Navarro, D. J. (2007). Does response scaling cause the generalized context model to mimic a prototype model? Psychonomic Bulletin & Review, 14(6).
Navarro, D. J. (2004). A note on the applied use of MDL approximations. Neural Computation, 16.
Pitt, M. A., Myung, I. J., & Zhang, S. (2002). Toward a method of selecting among computational models of cognition. Psychological Review, 109.
Rissanen, J. J. (1996). Fisher information and stochastic complexity. IEEE Transactions on Information Theory, 42.
Schwartz, G. (1978). Estimating the dimension of a model. Annals of Statistics, 6.
Wagenmakers, E.-J., & Waldorp, L. (2006). Editors' introduction. Journal of Mathematical Psychology, 50.

Acknowledgements

This paper is based on Hao Wu's Master of Arts thesis submitted to the Ohio State University in July. It is supported in part by National Institute of Health Grant R01-MH57472 to JIM. Work on this paper by the third author was supported in part by a grant from the National Science Foundation (SES) to X. Hu and W. H. Batchelder (Co-PIs). We wish to thank Mark A. Pitt and Michael W. Browne for the valuable feedback provided on the project. Correspondence concerning this article should be addressed to Hao Wu, Department of Psychology, Ohio State University, 1835 Neil Avenue, Columbus, OH. wu.498@osu.edu.
Table 1: Summary of the six hypotheses in Bayen, Murnane & Erdfelder (1996). Each parameter has two subscripts denoting the levels of the two experimental factors. The first subscript i denotes the level of the distractor similarity factor and the second subscript j denotes the level of the source similarity factor. H, M, and L denote the High, Medium, and Low levels.

H1: D_Hj < D_Mj < D_Lj, for all j ∈ {H, M, L}
H2: D_iH = D_iM = D_iL, for all i ∈ {H, M, L}
H3: d_Hj = d_Mj = d_Lj, for all j ∈ {H, M, L}
H4: d_iH < d_iM < d_iL, for all i ∈ {H, M, L}
H5: b_Hj > b_Mj > b_Lj, for all j ∈ {H, M, L}
H6: b_iH = b_iM = b_iL, for all i ∈ {H, M, L}
Table 2: Summary of model selection results for the source monitoring data from Bayen, Murnane & Erdfelder (1996, Appendix C). The total sample size is N = 11,232. The number of parameters S is listed for each model. The middle part of the table summarizes the equality constraints assumed by the models. The symbols are defined as follows: "s2" stands for "set to be the same across distractor similarity conditions only"; "s1" stands for "set to be the same across source similarity conditions only"; "s" stands for "set to be the same across all 9 conditions"; "v" stands for "allowed to vary across all 9 conditions". In the row for parameter D', "0" stands for "set to 0" and "D" stands for "fixed to parameter D in the same experimental condition". For the inequality constraints associated with FIAi, see the text. The table continues in Table 3.

Model   M0-0  M0-1  M0-2  M0-3  M0-4  M0-5
D       s1    s1    s     s1    s     s1
D'      s1    s1    s1    s     s1    s
b       v     s     v     v     s     s1
d       s2    s2    s2    s2    s2    s2
g       v     v     v     v     v     v

(The remaining rows of the table, S, -LML, C_FIA and FIA, the corresponding FIAi values, C_BIC and BIC, and C_AIC and AIC, report the fit and criterion values for each model.)
Table 3: Continuation of Table 2.

Model   M1-1  M1-2  M1-3  M1-4  M2-1  M2-2  M2-3  M2-4  M2-5
D       s1    s     s1    s     s1    s     s1    s     s1
D'      0     0     0     0     D     D     D     D     D
b       v     v     s1    s1    v     v     s1    s1    s
d       s2    s2    s2    s2    s2    s2    s2    s2    s2
g       v     v     v     v     v     v     v     v     v

(The remaining rows of the table, S, -LML, C_FIA and FIA, the corresponding FIAi values, C_BIC and BIC, and C_AIC and AIC, report the fit and criterion values for each model.)
Figure 1: The two-high-threshold multinomial processing tree model of source monitoring. Adapted from Bayen, Murnane & Erdfelder (1996, Figure 3). The parameters are defined as follows: D1 (detectability of source A items); D2 (detectability of source B items); d1 (source discriminability of source A items); d2 (source discriminability of source B items); a (guessing that a detected but nondiscriminated item belongs to source A); b (guessing "old" for a nondetected item); g (guessing that a nondetected item biased as "old" belongs to source A).
Figure 2: C_FIA complexity curves for six source-monitoring models from Batchelder & Riefer (1990, Figure 2), plotted as a function of the percentage of new items for a total sample size of n_Old + n_New = . Sample sizes for the two sources are the same. The six models differ from one another not only in the number of parameters but also in functional form. The number of parameters of each model is indicated by the first digit of the corresponding label in the legend.
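Complexity curves like those in Figure 2 come from evaluating C_FIA of Equation (2) at each design point. A minimal sketch of that computation, using a one-parameter Bernoulli model rather than an MPT model and assuming the parameter space is the unit interval:

```python
import math

def c_fia(n_sample, s, sqrt_det_fisher, grid=100_000):
    """C_FIA = (S/2) ln(N / 2*pi) + ln(integral of sqrt(det I(theta))),
    with the integral over (0, 1) approximated by the midpoint rule."""
    integral = sum(sqrt_det_fisher((k + 0.5) / grid) for k in range(grid)) / grid
    return s / 2 * math.log(n_sample / (2 * math.pi)) + math.log(integral)

def bernoulli_sqrt_fisher(theta):
    # For a Bernoulli model, I(theta) = 1 / (theta * (1 - theta));
    # the integral of its square root over (0, 1) equals pi exactly.
    return (theta * (1 - theta)) ** -0.5
```

For example, c_fia(1000, 1, bernoulli_sqrt_fisher) is close to (1/2) ln(1000 / 2π) + ln π ≈ 3.68. For MPT models the Fisher determinant and the multi-dimensional integral are more involved, which is what the MATLAB program mentioned in footnote 3 handles.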
More informationTopics in Machine Learning-EE 5359 Model Assessment and Selection
Topics in Machine Learning-EE 5359 Model Assessment and Selection Ioannis D. Schizas Electrical Engineering Department University of Texas at Arlington 1 Training and Generalization Training stage: Utilizing
More informationSTATISTICS (STAT) Statistics (STAT) 1
Statistics (STAT) 1 STATISTICS (STAT) STAT 2013 Elementary Statistics (A) Prerequisites: MATH 1483 or MATH 1513, each with a grade of "C" or better; or an acceptable placement score (see placement.okstate.edu).
More informationText Modeling with the Trace Norm
Text Modeling with the Trace Norm Jason D. M. Rennie jrennie@gmail.com April 14, 2006 1 Introduction We have two goals: (1) to find a low-dimensional representation of text that allows generalization to
More informationCS 664 Flexible Templates. Daniel Huttenlocher
CS 664 Flexible Templates Daniel Huttenlocher Flexible Template Matching Pictorial structures Parts connected by springs and appearance models for each part Used for human bodies, faces Fischler&Elschlager,
More informationHierarchical Mixture Models for Nested Data Structures
Hierarchical Mixture Models for Nested Data Structures Jeroen K. Vermunt 1 and Jay Magidson 2 1 Department of Methodology and Statistics, Tilburg University, PO Box 90153, 5000 LE Tilburg, Netherlands
More information7. Decision or classification trees
7. Decision or classification trees Next we are going to consider a rather different approach from those presented so far to machine learning that use one of the most common and important data structure,
More informationChallenges on Combining Open Web and Dataset Evaluation Results: The Case of the Contextual Suggestion Track
Challenges on Combining Open Web and Dataset Evaluation Results: The Case of the Contextual Suggestion Track Alejandro Bellogín 1,2, Thaer Samar 1, Arjen P. de Vries 1, and Alan Said 1 1 Centrum Wiskunde
More informationDynamic Thresholding for Image Analysis
Dynamic Thresholding for Image Analysis Statistical Consulting Report for Edward Chan Clean Energy Research Center University of British Columbia by Libo Lu Department of Statistics University of British
More informationControl Invitation
Online Appendices Appendix A. Invitation Emails Control Invitation Email Subject: Reviewer Invitation from JPubE You are invited to review the above-mentioned manuscript for publication in the. The manuscript's
More informationSemi-Supervised Clustering with Partial Background Information
Semi-Supervised Clustering with Partial Background Information Jing Gao Pang-Ning Tan Haibin Cheng Abstract Incorporating background knowledge into unsupervised clustering algorithms has been the subject
More informationCOPULA MODELS FOR BIG DATA USING DATA SHUFFLING
COPULA MODELS FOR BIG DATA USING DATA SHUFFLING Krish Muralidhar, Rathindra Sarathy Department of Marketing & Supply Chain Management, Price College of Business, University of Oklahoma, Norman OK 73019
More informationA ew Algorithm for Community Identification in Linked Data
A ew Algorithm for Community Identification in Linked Data Nacim Fateh Chikhi, Bernard Rothenburger, Nathalie Aussenac-Gilles Institut de Recherche en Informatique de Toulouse 118, route de Narbonne 31062
More informationCOLOR FIDELITY OF CHROMATIC DISTRIBUTIONS BY TRIAD ILLUMINANT COMPARISON. Marcel P. Lucassen, Theo Gevers, Arjan Gijsenij
COLOR FIDELITY OF CHROMATIC DISTRIBUTIONS BY TRIAD ILLUMINANT COMPARISON Marcel P. Lucassen, Theo Gevers, Arjan Gijsenij Intelligent Systems Lab Amsterdam, University of Amsterdam ABSTRACT Performance
More information4.12 Generalization. In back-propagation learning, as many training examples as possible are typically used.
1 4.12 Generalization In back-propagation learning, as many training examples as possible are typically used. It is hoped that the network so designed generalizes well. A network generalizes well when
More informationA Formal Approach to Score Normalization for Meta-search
A Formal Approach to Score Normalization for Meta-search R. Manmatha and H. Sever Center for Intelligent Information Retrieval Computer Science Department University of Massachusetts Amherst, MA 01003
More informationSamuel Coolidge, Dan Simon, Dennis Shasha, Technical Report NYU/CIMS/TR
Detecting Missing and Spurious Edges in Large, Dense Networks Using Parallel Computing Samuel Coolidge, sam.r.coolidge@gmail.com Dan Simon, des480@nyu.edu Dennis Shasha, shasha@cims.nyu.edu Technical Report
More information7. Collinearity and Model Selection
Sociology 740 John Fox Lecture Notes 7. Collinearity and Model Selection Copyright 2014 by John Fox Collinearity and Model Selection 1 1. Introduction I When there is a perfect linear relationship among
More informationComparative analysis of data mining methods for predicting credit default probabilities in a retail bank portfolio
Comparative analysis of data mining methods for predicting credit default probabilities in a retail bank portfolio Adela Ioana Tudor, Adela Bâra, Simona Vasilica Oprea Department of Economic Informatics
More informationGUIDELINES FOR MASTER OF SCIENCE INTERNSHIP THESIS
GUIDELINES FOR MASTER OF SCIENCE INTERNSHIP THESIS Dear Participant of the MScIS Program, If you have chosen to follow an internship, one of the requirements is to write a Thesis. This document gives you
More informationPolynomial Curve Fitting of Execution Time of Binary Search in Worst Case in Personal Computer
Polynomial Curve Fitting of Execution Time of Binary Search in Worst Case in Personal Computer Dipankar Das Assistant Professor, The Heritage Academy, Kolkata, India Abstract Curve fitting is a well known
More informationOptimal Experimental Design for Model Discrimination
Optimal Experimental Design for Model Discrimination Jay I. Myung and Mark A. Pitt Ohio State University June 3, 2009 (in press in Psychological Review) Abstract Models of a psychological process can be
More informationIDENTIFICATION AND ELIMINATION OF INTERIOR POINTS FOR THE MINIMUM ENCLOSING BALL PROBLEM
IDENTIFICATION AND ELIMINATION OF INTERIOR POINTS FOR THE MINIMUM ENCLOSING BALL PROBLEM S. DAMLA AHIPAŞAOĞLU AND E. ALPER Yıldırım Abstract. Given A := {a 1,..., a m } R n, we consider the problem of
More informationSpectral Clustering and Community Detection in Labeled Graphs
Spectral Clustering and Community Detection in Labeled Graphs Brandon Fain, Stavros Sintos, Nisarg Raval Machine Learning (CompSci 571D / STA 561D) December 7, 2015 {btfain, nisarg, ssintos} at cs.duke.edu
More informationLearning Graph Grammars
Learning Graph Grammars 1 Aly El Gamal ECE Department and Coordinated Science Laboratory University of Illinois at Urbana-Champaign Abstract Discovery of patterns in a given graph - in the form of repeated
More informationFACILITY LIFE-CYCLE COST ANALYSIS BASED ON FUZZY SETS THEORY Life-cycle cost analysis
FACILITY LIFE-CYCLE COST ANALYSIS BASED ON FUZZY SETS THEORY Life-cycle cost analysis J. O. SOBANJO FAMU-FSU College of Engineering, Tallahassee, Florida Durability of Building Materials and Components
More informationEstimation of Bilateral Connections in a Network: Copula vs. Maximum Entropy
Estimation of Bilateral Connections in a Network: Copula vs. Maximum Entropy Pallavi Baral and Jose Pedro Fique Department of Economics Indiana University at Bloomington 1st Annual CIRANO Workshop on Networks
More informationDual-Frame Sample Sizes (RDD and Cell) for Future Minnesota Health Access Surveys
Dual-Frame Sample Sizes (RDD and Cell) for Future Minnesota Health Access Surveys Steven Pedlow 1, Kanru Xia 1, Michael Davern 1 1 NORC/University of Chicago, 55 E. Monroe Suite 2000, Chicago, IL 60603
More informationTrajectory Compression under Network Constraints
Trajectory Compression under Network Constraints Georgios Kellaris, Nikos Pelekis, and Yannis Theodoridis Department of Informatics, University of Piraeus, Greece {gkellar,npelekis,ytheod}@unipi.gr http://infolab.cs.unipi.gr
More information10601 Machine Learning. Model and feature selection
10601 Machine Learning Model and feature selection Model selection issues We have seen some of this before Selecting features (or basis functions) Logistic regression SVMs Selecting parameter value Prior
More informationFSRM Feedback Algorithm based on Learning Theory
Send Orders for Reprints to reprints@benthamscience.ae The Open Cybernetics & Systemics Journal, 2015, 9, 699-703 699 FSRM Feedback Algorithm based on Learning Theory Open Access Zhang Shui-Li *, Dong
More informationThe Encoding Complexity of Network Coding
The Encoding Complexity of Network Coding Michael Langberg Alexander Sprintson Jehoshua Bruck California Institute of Technology Email mikel,spalex,bruck @caltech.edu Abstract In the multicast network
More informationRandom projection for non-gaussian mixture models
Random projection for non-gaussian mixture models Győző Gidófalvi Department of Computer Science and Engineering University of California, San Diego La Jolla, CA 92037 gyozo@cs.ucsd.edu Abstract Recently,
More informationSeparating Objects and Clutter in Indoor Scenes
Separating Objects and Clutter in Indoor Scenes Salman H. Khan School of Computer Science & Software Engineering, The University of Western Australia Co-authors: Xuming He, Mohammed Bennamoun, Ferdous
More informationLearning to Match. Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li
Learning to Match Jun Xu, Zhengdong Lu, Tianqi Chen, Hang Li 1. Introduction The main tasks in many applications can be formalized as matching between heterogeneous objects, including search, recommendation,
More informationCPSC 532L Project Development and Axiomatization of a Ranking System
CPSC 532L Project Development and Axiomatization of a Ranking System Catherine Gamroth cgamroth@cs.ubc.ca Hammad Ali hammada@cs.ubc.ca April 22, 2009 Abstract Ranking systems are central to many internet
More informationNotes and Comment. Bias in exponential and power function fits due to noise: Comment on Myung, Kim, and Pitt
Memory & Cognition 2003, 31 (4), 656-661 Notes and Comment Bias in exponential and power function fits due to noise: Comment on Myung, Kim, and Pitt SCOTT BROWN University of California, Irvine, California
More informationWorkshop 8: Model selection
Workshop 8: Model selection Selecting among candidate models requires a criterion for evaluating and comparing models, and a strategy for searching the possibilities. In this workshop we will explore some
More informationThe fastclime Package for Linear Programming and Large-Scale Precision Matrix Estimation in R
Journal of Machine Learning Research (2013) Submitted ; Published The fastclime Package for Linear Programming and Large-Scale Precision Matrix Estimation in R Haotian Pang Han Liu Robert Vanderbei Princeton
More informationLecture 19. Lecturer: Aleksander Mądry Scribes: Chidambaram Annamalai and Carsten Moldenhauer
CS-621 Theory Gems November 21, 2012 Lecture 19 Lecturer: Aleksander Mądry Scribes: Chidambaram Annamalai and Carsten Moldenhauer 1 Introduction We continue our exploration of streaming algorithms. First,
More informationIdentifying Layout Classes for Mathematical Symbols Using Layout Context
Rochester Institute of Technology RIT Scholar Works Articles 2009 Identifying Layout Classes for Mathematical Symbols Using Layout Context Ling Ouyang Rochester Institute of Technology Richard Zanibbi
More informationCognitive Principles and Reasoning Clusters in Multinomial Process Tree: A Case Study for Human Syllogistic Reasoning
ognitive Principles and Reasoning lusters in Multinomial Process Tree: A ase Study for Human Syllogistic Reasoning Orianne Bargain Technische Universität Dresden April 11, 2018 Table of ontents Preliminaries
More informationSupplementary text S6 Comparison studies on simulated data
Supplementary text S Comparison studies on simulated data Peter Langfelder, Rui Luo, Michael C. Oldham, and Steve Horvath Corresponding author: shorvath@mednet.ucla.edu Overview In this document we illustrate
More informationI How does the formulation (5) serve the purpose of the composite parameterization
Supplemental Material to Identifying Alzheimer s Disease-Related Brain Regions from Multi-Modality Neuroimaging Data using Sparse Composite Linear Discrimination Analysis I How does the formulation (5)
More informationQuery Likelihood with Negative Query Generation
Query Likelihood with Negative Query Generation Yuanhua Lv Department of Computer Science University of Illinois at Urbana-Champaign Urbana, IL 61801 ylv2@uiuc.edu ChengXiang Zhai Department of Computer
More informationDefinition, Detection, and Evaluation of Meeting Events in Airport Surveillance Videos
Definition, Detection, and Evaluation of Meeting Events in Airport Surveillance Videos Sung Chun Lee, Chang Huang, and Ram Nevatia University of Southern California, Los Angeles, CA 90089, USA sungchun@usc.edu,
More informationSubjective Randomness and Natural Scene Statistics
Subjective Randomness and Natural Scene Statistics Ethan Schreiber (els@cs.brown.edu) Department of Computer Science, 115 Waterman St Providence, RI 02912 USA Thomas L. Griffiths (tom griffiths@berkeley.edu)
More informationSIMULTANEOUS COMPUTATION OF MODEL ORDER AND PARAMETER ESTIMATION FOR ARX MODEL BASED ON MULTI- SWARM PARTICLE SWARM OPTIMIZATION
SIMULTANEOUS COMPUTATION OF MODEL ORDER AND PARAMETER ESTIMATION FOR ARX MODEL BASED ON MULTI- SWARM PARTICLE SWARM OPTIMIZATION Kamil Zakwan Mohd Azmi, Zuwairie Ibrahim and Dwi Pebrianti Faculty of Electrical
More informationModel Comparison in Psychology
Model Comparison in Psychology Jay I. Myung # and Mark A. Pitt Department of Psychology Ohio State University Columbus, OH 43210 {myung.1, pitt.2}@osu.edu # Corresponding author April 28, 2016 Word count:
More informationInstantaneously trained neural networks with complex inputs
Louisiana State University LSU Digital Commons LSU Master's Theses Graduate School 2003 Instantaneously trained neural networks with complex inputs Pritam Rajagopal Louisiana State University and Agricultural
More informationA Comparison of Error Metrics for Learning Model Parameters in Bayesian Knowledge Tracing
A Comparison of Error Metrics for Learning Model Parameters in Bayesian Knowledge Tracing Asif Dhanani Seung Yeon Lee Phitchaya Phothilimthana Zachary Pardos Electrical Engineering and Computer Sciences
More informationWhat is machine learning?
Machine learning, pattern recognition and statistical data modelling Lecture 12. The last lecture Coryn Bailer-Jones 1 What is machine learning? Data description and interpretation finding simpler relationship
More informationAnalysis of missing values in simultaneous. functional relationship model for circular variables
Analysis of missing values in simultaneous linear functional relationship model for circular variables S. F. Hassan, Y. Z. Zubairi and A. G. Hussin* Centre for Foundation Studies in Science, University
More informationInformation Criteria Methods in SAS for Multiple Linear Regression Models
Paper SA5 Information Criteria Methods in SAS for Multiple Linear Regression Models Dennis J. Beal, Science Applications International Corporation, Oak Ridge, TN ABSTRACT SAS 9.1 calculates Akaike s Information
More informationDistributed Detection in Sensor Networks: Connectivity Graph and Small World Networks
Distributed Detection in Sensor Networks: Connectivity Graph and Small World Networks SaeedA.AldosariandJoséM.F.Moura Electrical and Computer Engineering Department Carnegie Mellon University 5000 Forbes
More informationA Bayesian approach to artificial neural network model selection
A Bayesian approach to artificial neural network model selection Kingston, G. B., H. R. Maier and M. F. Lambert Centre for Applied Modelling in Water Engineering, School of Civil and Environmental Engineering,
More informationFramework for replica selection in fault-tolerant distributed systems
Framework for replica selection in fault-tolerant distributed systems Daniel Popescu Computer Science Department University of Southern California Los Angeles, CA 90089-0781 {dpopescu}@usc.edu Abstract.
More information3 Graphical Displays of Data
3 Graphical Displays of Data Reading: SW Chapter 2, Sections 1-6 Summarizing and Displaying Qualitative Data The data below are from a study of thyroid cancer, using NMTR data. The investigators looked
More informationCatering to Your Tastes: Using PROC OPTEX to Design Custom Experiments, with Applications in Food Science and Field Trials
Paper 3148-2015 Catering to Your Tastes: Using PROC OPTEX to Design Custom Experiments, with Applications in Food Science and Field Trials Clifford Pereira, Department of Statistics, Oregon State University;
More informationSampling informative/complex a priori probability distributions using Gibbs sampling assisted by sequential simulation
Sampling informative/complex a priori probability distributions using Gibbs sampling assisted by sequential simulation Thomas Mejer Hansen, Klaus Mosegaard, and Knud Skou Cordua 1 1 Center for Energy Resources
More informationSTA121: Applied Regression Analysis
STA121: Applied Regression Analysis Variable Selection - Chapters 8 in Dielman Artin Department of Statistical Science October 23, 2009 Outline Introduction 1 Introduction 2 3 4 Variable Selection Model
More informationA SOCIAL NETWORK ANALYSIS APPROACH TO ANALYZE ROAD NETWORKS INTRODUCTION
A SOCIAL NETWORK ANALYSIS APPROACH TO ANALYZE ROAD NETWORKS Kyoungjin Park Alper Yilmaz Photogrammetric and Computer Vision Lab Ohio State University park.764@osu.edu yilmaz.15@osu.edu ABSTRACT Depending
More informationA Statistical Analysis of UK Financial Networks
A Statistical Analysis of UK Financial Networks J. Chu & S. Nadarajah First version: 31 December 2016 Research Report No. 9, 2016, Probability and Statistics Group School of Mathematics, The University
More information2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006
2386 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 6, JUNE 2006 The Encoding Complexity of Network Coding Michael Langberg, Member, IEEE, Alexander Sprintson, Member, IEEE, and Jehoshua Bruck,
More informationStatistical Analysis of List Experiments
Statistical Analysis of List Experiments Kosuke Imai Princeton University Joint work with Graeme Blair October 29, 2010 Blair and Imai (Princeton) List Experiments NJIT (Mathematics) 1 / 26 Motivation
More informationNovel Lossy Compression Algorithms with Stacked Autoencoders
Novel Lossy Compression Algorithms with Stacked Autoencoders Anand Atreya and Daniel O Shea {aatreya, djoshea}@stanford.edu 11 December 2009 1. Introduction 1.1. Lossy compression Lossy compression is
More informationDeep Generative Models Variational Autoencoders
Deep Generative Models Variational Autoencoders Sudeshna Sarkar 5 April 2017 Generative Nets Generative models that represent probability distributions over multiple variables in some way. Directed Generative
More informationRobust Shape Retrieval Using Maximum Likelihood Theory
Robust Shape Retrieval Using Maximum Likelihood Theory Naif Alajlan 1, Paul Fieguth 2, and Mohamed Kamel 1 1 PAMI Lab, E & CE Dept., UW, Waterloo, ON, N2L 3G1, Canada. naif, mkamel@pami.uwaterloo.ca 2
More informationThe Cross-Entropy Method
The Cross-Entropy Method Guy Weichenberg 7 September 2003 Introduction This report is a summary of the theory underlying the Cross-Entropy (CE) method, as discussed in the tutorial by de Boer, Kroese,
More informationCOMPARING MODELS AND CURVES. Fitting models to biological data using linear and nonlinear regression. A practical guide to curve fitting.
COMPARING MODELS AND CURVES An excerpt from a forthcoming book: Fitting models to biological data using linear and nonlinear regression. A practical guide to curve fitting. Harvey Motulsky GraphPad Software
More informationChapter 6 Continued: Partitioning Methods
Chapter 6 Continued: Partitioning Methods Partitioning methods fix the number of clusters k and seek the best possible partition for that k. The goal is to choose the partition which gives the optimal
More informationAN APPROXIMATION APPROACH FOR RANKING FUZZY NUMBERS BASED ON WEIGHTED INTERVAL - VALUE 1.INTRODUCTION
Mathematical and Computational Applications, Vol. 16, No. 3, pp. 588-597, 2011. Association for Scientific Research AN APPROXIMATION APPROACH FOR RANKING FUZZY NUMBERS BASED ON WEIGHTED INTERVAL - VALUE
More informationChapter 10. Conclusion Discussion
Chapter 10 Conclusion 10.1 Discussion Question 1: Usually a dynamic system has delays and feedback. Can OMEGA handle systems with infinite delays, and with elastic delays? OMEGA handles those systems with
More informationRobust Signal-Structure Reconstruction
Robust Signal-Structure Reconstruction V. Chetty 1, D. Hayden 2, J. Gonçalves 2, and S. Warnick 1 1 Information and Decision Algorithms Laboratories, Brigham Young University 2 Control Group, Department
More informationFrequently Asked Questions Updated 2006 (TRIM version 3.51) PREPARING DATA & RUNNING TRIM
Frequently Asked Questions Updated 2006 (TRIM version 3.51) PREPARING DATA & RUNNING TRIM * Which directories are used for input files and output files? See menu-item "Options" and page 22 in the manual.
More informationExploring Econometric Model Selection Using Sensitivity Analysis
Exploring Econometric Model Selection Using Sensitivity Analysis William Becker Paolo Paruolo Andrea Saltelli Nice, 2 nd July 2013 Outline What is the problem we are addressing? Past approaches Hoover
More informationAn Approach to Identify the Number of Clusters
An Approach to Identify the Number of Clusters Katelyn Gao Heather Hardeman Edward Lim Cristian Potter Carl Meyer Ralph Abbey July 11, 212 Abstract In this technological age, vast amounts of data are generated.
More informationA Connection between Network Coding and. Convolutional Codes
A Connection between Network Coding and 1 Convolutional Codes Christina Fragouli, Emina Soljanin christina.fragouli@epfl.ch, emina@lucent.com Abstract The min-cut, max-flow theorem states that a source
More informationFurther Applications of a Particle Visualization Framework
Further Applications of a Particle Visualization Framework Ke Yin, Ian Davidson Department of Computer Science SUNY-Albany 1400 Washington Ave. Albany, NY, USA, 12222. Abstract. Our previous work introduced
More informationOne-mode Additive Clustering of Multiway Data
One-mode Additive Clustering of Multiway Data Dirk Depril and Iven Van Mechelen KULeuven Tiensestraat 103 3000 Leuven, Belgium (e-mail: dirk.depril@psy.kuleuven.ac.be iven.vanmechelen@psy.kuleuven.ac.be)
More informationModel selection Outline for today
Model selection Outline for today The problem of model selection Choose among models by a criterion rather than significance testing Criteria: Mallow s C p and AIC Search strategies: All subsets; stepaic
More informationProbability Evaluation in MHT with a Product Set Representation of Hypotheses
Probability Evaluation in MHT with a Product Set Representation of Hypotheses Johannes Wintenby Ericsson Microwave Systems 431 84 Mölndal, Sweden johannes.wintenby@ericsson.com Abstract - Multiple Hypothesis
More information