A Two-Level Approach to Making Class Predictions


Adrian Costea
Turku Centre for Computer Science and IAMSR / Åbo Akademi University, Turku, Finland, Adrian.Costea@abo.fi

Tomas Eklund
Turku Centre for Computer Science and IAMSR / Åbo Akademi University, Turku, Finland, Tomas.Eklund@abo.fi

Abstract

In this paper we propose a new two-level methodology for assessing countries'/companies' economic/financial performance. The methodology is based on two major techniques for grouping data: cluster analysis and predictive classification models. First, we use cluster analysis, in the form of self-organizing maps, to find possible clusters in the data in terms of economic/financial performance. We then interpret the maps and define outcome values (classes) for each data row. Lastly, we build classifiers using two different predictive models (multinomial logistic regression and decision trees) and compare the accuracy of these models. Our findings indicate that the results of the two classification techniques are similar in terms of accuracy rate and class predictions. Furthermore, we focus our efforts on understanding the decision process corresponding to the two predictive models. Moreover, we claim that our methodology, if correctly implemented, extends the applicability of the self-organizing map for clustering of financial data and, thereby, for financial analysis.

1. Introduction

In this study, we are interested in the relationship between a number of macro/microeconomic indicators of countries/companies and different economic/financial performance classifications. We have based our research on two previous studies, [2] and [3]. In [2] we compared two different methods of clustering central-east European countries' economic data (self-organizing maps and statistical clustering) and presented the advantages and disadvantages of each method. In [3], the self-organizing map (SOM) was used for benchmarking international pulp and paper companies. In both previous studies we were mainly concerned with finding patterns in economic/financial data and presenting this multidimensional data in an easy-to-read format (using SOM maps).
However, we have not addressed the problem of class prediction as new cases are added to our datasets. From our previous results we cannot directly infer a procedure with which a new data row could be fit into our maps. As we obtain new data, depending upon the standardization technique used, we may be forced to retrain the maps and repeat the entire clustering process. This is very time-consuming and requires the effort of an experienced SOM user. As Witten & Frank say in their book on data mining: "The success of clustering is measured subjectively in terms of how useful the result appears to be to a human user. It may be followed by a second step of classification learning where rules are learned that give an intelligible description of how new instances should be placed into the clusters." [17, p.39]

Here we propose a methodology that enables us to model the relationship between economic/financial variables and different classifications of countries/companies in terms of their performance. Defining the model permits us to predict the class (cluster) to which a new case belongs. In other words, we insert new data into our model and identify where they fit in the previously constructed map.

Choosing the best technique for these two phases of our analysis (clustering/benchmarking/visualization and class prediction) is not a trivial task. In the literature there is a large number of techniques for both clustering and class prediction. In this study, we use SOM as the clustering technique because of its good visualization and reduced computational cost. Even with a relatively small number of samples, many clustering algorithms, especially hierarchical ones (for example, the Unweighted Pair Group Method with Arithmetic Mean (UPGMA), Ward's, or other bottom-up hierarchical clustering methods), become intractably heavy [16]. Descriptive techniques, such as clustering, simply summarize data in convenient ways, or in ways that we hope will lead to increased understanding.
In contrast, predictive techniques, such as multinomial logistic regression and decision trees, allow us to predict the probability that data rows will be clustered in a specific class in the trained SOM model. In order to find the predictive technique that is most suitable in our particular case, we conduct two experiments using multinomial logistic regression and decision tree techniques. When building real classifiers one can use three different fundamental approaches: the discriminative approach, the regression approach, and the class-conditional approach [6, p.335]. We chose to compare two regression approach methods: multinomial logistic regression and decision trees.

The rest of the paper is structured as follows. In Section two we present our methodology. In Section three, the datasets are presented and SOM clustering is performed. In Sections four and five, the multinomial regression and decision tree models are built and validated, and in Section six the models are compared. Finally, in Section seven, we present our conclusions.

2. Methodology

In our two-level approach we add another level (the class prediction phase) to SOM clustering, as depicted in Figure 1 (the arrows are the levels):

[Figure 1. Two-level methodology: Initial dataset -> (1) -> Data in form of SOM -> (2) -> Data prediction model]

(1) consists of several stages: preprocessing of the initial data, training using the SOM algorithm, choosing the best maps, identifying the clusters, and attaching outcome values to each data row [1];

(2) depending on the technique that we apply, there can be different stages for this methodology level. When applying statistical techniques, such as multinomial logistic regression, we follow these steps: developing the analysis plan, estimating the logistic regression, assessing model fit (accuracy), interpreting the results, and validating the model. When applying the decision tree algorithm, the steps are: constructing a decision tree step by step, including one attribute at a time in the model, assessing model accuracy, interpreting the results, and validating the model.

After the predictive models for classification were constructed, we compared them based on their accuracy measures. Quinlan [10] states that there are different ways of comparing models besides their accuracy, e.g. the insight provided by the predictive model. However, we will use the accuracy measure, since the example above is a subjective measure.

3. Clustering Using SOM

The SOM algorithm stands for self-organizing map algorithm, and is based on a two-layer neural network using the unsupervised learning method. The self-organizing map technique creates a two-dimensional map from n-dimensional input data. This map resembles a landscape in which it is possible to identify borders that define different clusters [8]. These clusters consist of input data with similar characteristics, i.e. in this report, of countries/companies with similar economic/financial performance.

The methodology used when applying the self-organizing map is as follows [1]. First, we choose the data material. It is often advisable to standardize the input data so that the learning task of the network becomes easier [8]. After this, we choose the network topology, learning rate, and neighborhood radius. Then, the network is constructed. The construction process takes place by showing the input data to the network iteratively, using the same input vector many times, the so-called training length. The process ends when the average quantization error is small enough. The best map is chosen for further analysis. Finally, we identify the clusters using the U-matrix and interpret the clusters (assign labels to them) using the feature planes. From the feature planes we can read, per input variable and per neuron, the value of the variable associated with each neuron.

The network topology refers to the form of the lattice. There are two commonly used lattices, rectangular and hexagonal. The hexagonal lattice is preferable for visualization purposes, as each neuron has six neighbors, as opposed to four for the rectangular lattice [8]. The learning rate refers to how much the winning input data vector affects the surrounding network. The neighborhood radius refers to how much of the surrounding network is affected. The average quantization error indicates the average distance between the best matching units and the input data vectors. Generally speaking, a lower quantization error indicates a better-trained map. The sample data size is not of major concern when using the SOM algorithm.
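The roles of the best matching unit, the learning rate, the neighborhood radius and the average quantization error can be made concrete with a short sketch. The Python below is only an illustration under stated assumptions: Euclidean distance, a "bubble" neighborhood, a constant learning rate, the toy data and all function names are choices made here, not details of the SOM software used in this study.

```python
import math


def best_matching_unit(weights, x):
    """Return (index, distance) of the neuron whose weight vector is closest to x."""
    distances = [math.dist(w, x) for w in weights]
    index = min(range(len(weights)), key=distances.__getitem__)
    return index, distances[index]


def average_quantization_error(weights, data):
    """Mean distance between each input vector and its best matching unit."""
    return sum(best_matching_unit(weights, x)[1] for x in data) / len(data)


def som_update(weights, grid, x, learning_rate, radius):
    """One training step: pull the BMU and its lattice neighbors toward x.

    Assumption: a "bubble" neighborhood, i.e. every neuron whose lattice
    position lies within `radius` of the BMU moves by the same learning rate.
    """
    bmu, _ = best_matching_unit(weights, x)
    return [
        [wi + learning_rate * (xi - wi) for wi, xi in zip(w, x)]
        if math.dist(grid[k], grid[bmu]) <= radius
        else list(w)
        for k, w in enumerate(weights)
    ]


# Hypothetical toy map: three neurons on a 3x1 rectangular lattice.
grid = [(0, 0), (1, 0), (2, 0)]
weights = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]
data = [[0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]

error_before = average_quantization_error(weights, data)
for x in data:
    weights = som_update(weights, grid, x, learning_rate=0.5, radius=0.0)
error_after = average_quantization_error(weights, data)
```

On this toy example a single pass over the data lowers the average quantization error, which is the sense in which a lower error indicates a better-trained map. A real training run would also shrink the learning rate and the radius over time.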
In [15] the author claims that SOM is easily applicable to small data sets (fewer than [...] records) but can also be applied to medium-sized data sets.

To visualize the final self-organizing map we use the unified distance matrix method (U-matrix). The U-matrix method can be used to discover otherwise invisible relationships in a high-dimensional data space. It also makes it possible to classify data sets into clusters of similar values. The simplest U-matrix method is to calculate the distances between neighboring neurons and store them in a matrix, i.e. the output map, which can then be interpreted. If there are "walls" between the neurons, the neighboring weights are distant, i.e. the values differ significantly. The distance values can also be displayed in color when the U-matrix is visualized. Hence, dark colors represent great distances, while brighter colors indicate similarities amongst the neurons [14].

3.1. Datasets

In this study we have used two datasets from our previous papers: one dataset on the general economic performance (EconomicPerf) of the central-east European countries [2] and another (FinancialPerf) on the financial performance of international pulp and paper companies [3].

The variables for the first dataset are: Currency Value, or how much money one can buy with 1000 USD, which depicts the purchasing power of each country's currency (the greater the better); Domestic Prime Rate (Refinancing Rate), which shows financial performance and the level of investment opportunities (the smaller the better); Industrial Output, as a percentage of the previous period, which depicts industrial economic development (the greater the better); Unemployment Rate, which characterizes the social situation in the country (the smaller the better); and Foreign Trade in millions of US dollars, which reveals the deficit/surplus of the trade budget (the greater the better).

In [2] there were two more variables in the dataset: import and export in million USD, as intermediary measures to calculate the foreign trade. We did not take them into account here, since they are strongly correlated with the foreign trade variable. Also, we have replaced the first variable (Foreign Exchange Rate) from the previous study [2] with Currency Value, which is calculated from the Foreign Exchange Rate variable by inverting it and multiplying the result by [...]. We have changed this variable to ensure comparability among different countries' currencies.

Our dataset contains monthly/annual data for six countries (Russia, Ukraine, Romania, Poland, Slovenia and Latvia) during [...], in total 225 cases with five variables each. In some cases we encountered missing data, which we filled in using means of existing values. However, the self-organizing map algorithm can treat the problem of missing data simply by considering at each learning step only those indicators that are available [7].

The second dataset consisted of financial data on international pulp and paper companies. The dataset covered the period [...], and consisted of seven financial ratios per year for each company. The ratios were chosen from an empirical study by Lehtinen [9], in which a number of financial ratios were evaluated concerning their validity and reliability in an international context.
The ratios chosen were: Operating Margin, a profitability ratio; Return on Equity (ROE), a profitability ratio; Return on Total Assets (ROTA), a profitability ratio; Quick Ratio, a liquidity ratio; Equity to Capital, a solvency ratio; Interest Coverage, a solvency ratio; and Receivables Turnover, an efficiency ratio. The ratios were calculated based on information from the companies' annual reports. The dataset consisted of 77 companies and 7 regional averages. The companies were chosen from Pulp and Paper International's annual ranking of pulp and paper companies according to net sales [12]. In total, the dataset consisted of 474 rows of data.

3.2. Choosing the Best Maps

The two datasets were standardized according to different methods. In [2] the authors used the standard deviations of each variable to standardize the data (Equations 1, 2), while in [3] the data were scaled using histogram equalization [4]. It is not our intention to describe different methods for the standardization of datasets; however, in the literature there are examples of both standardization techniques used on similar datasets.

    \bar{x}_i = \frac{1}{n} \sum_{j=1}^{n} x_{ij}    [Eq. 1]

    \sigma_i = \sqrt{\frac{1}{n} \sum_{j=1}^{n} (x_{ij} - \bar{x}_i)^2}    [Eq. 2]

We have trained different maps with different parameters. As is stated in [2], a good map is obtained after several different training sessions. The best maps were chosen based on two measures: an objective measure (the quantization error) and a subjective measure (ease of readability). However, map quality in terms of quantization error seems to improve with the dimension of the maps, while ease of readability deteriorates. In other words, we can obtain very good maps in terms of their quantization error if we use large dimension parameters, but such maps are poor in terms of readability. Cluster analysis is often a trade-off between accuracy on the one hand and cluster clarity and manageability on the other: by creating small maps we force the data into larger clusters. Consequently, when we compared the maps we kept the maps' dimensions constant.
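The standard-deviation scaling referred to in Equations 1 and 2 can be sketched in a few lines of Python. This is a hedged illustration rather than the procedure of [2]: it assumes the population form of the standard deviation and that each variable is centered before being divided by its standard deviation, neither of which is spelled out in the text, and the data and function names are hypothetical.

```python
import statistics


def standardize_column(values):
    """Scale one variable using its mean (Eq. 1) and standard deviation (Eq. 2).

    Assumptions: population standard deviation, and centering before scaling;
    the text only states that the standard deviation was used.
    """
    mean = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]  # a constant variable carries no information
    return [(v - mean) / sigma for v in values]


def standardize_rows(rows):
    """Standardize every variable (column) of a list of equally long data rows."""
    columns = [standardize_column(col) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]


# Hypothetical data: three cases described by two variables on very different scales.
rows = [[1.0, 1000.0], [2.0, 3000.0], [3.0, 2000.0]]
scaled = standardize_rows(rows)
```

After scaling, both variables have mean 0 and standard deviation 1, so neither dominates the distance computations used during SOM training.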
The chosen maps and their clusters are presented in Figure 2.

3.3. Identifying the Clusters

We identify the clusters on the maps by studying the final U-matrix maps (Figure 2), the feature planes and, at the same time, the row data. Actually, the title of this section, "identifying the clusters", should be "identifying the clusters of clusters". What we mean is that we already have the clusters identified by SOM on the map (from now on we will refer to these clusters as row clusters). For example, if we are using a 7x5 map, we have 35 row clusters. Next we have to identify the "real" clusters by grouping the row clusters. SOM helps us in this respect by drawing darker lines between two clusters that are far from each other (in terms of the Euclidean distance). The results for both datasets were very similar in terms of the number and characteristics of clusters (7 in each case).

[Figure 2. (a) The final U-matrix maps and (b) identified clusters on the maps for the EconomicPerf and FinancialPerf data sets]

3.4. Defining the Outcome Values for each Row Data

Roughly speaking, the outcome values (the classes) in terms of economic/financial performance were the same in both cases (Figure 2), so the classes are as follows: A (best performance), B (slightly below best performance), C (slightly above average performance), D (average), E (slightly below average performance), F (slightly above poorest performance), and G (poorest performance).

Defining the outcome values for each data row is a straightforward process. Once we figure out which cluster each row cluster belongs to, the next step is to check which row data vectors are associated with each row cluster, and to associate the class code with those vectors. Consequently, in terms of methodology, we can divide the clustering process into two parts: creating the row clusters, a part done entirely by the SOM algorithm, the output being the U-matrix; and creating the real clusters, a part done by the map reader with the help of the visualization characteristics of the SOM algorithm.

This kind of multi-level clustering approach is not new. A two-level SOM clustering approach has been suggested before, in [16]. There, the row clusters are "protoclusters" and our real clusters are the "actual clusters". However, it is sometimes difficult to find good real clusters, since the second part of the clustering process is highly subjective. Also, the standardization method plays an important role, since for different standardization techniques we obtain different maps in terms of quantization error and ease of readability.

4. Applying Multinomial Logistic Regression

In general, when multinomial logistic regression is applied as a predictive modeling technique for classification, there are some steps that have to be followed:

1. Check the requirements regarding the data sample: size, missing data, etc.,
2. Compute the multinomial logistic regression using an available software program (e.g. SPSS),
3. Assess the model fit (accuracy),
4. Interpret the results, and
5. Validate the model.

Below, we follow this methodology when applying logistic regression on our datasets.

4.1. Requirements

In the EconomicPerf dataset, the problem of missing data was overcome by using monthly means for each year. Averages were also used for missing data in the FinancialPerf dataset. The sample-size requirement, expressed as a minimum number of cases for each independent variable, was exceeded for each dataset.

4.2. Computing the Multinomial Regression Model

We use SPSS to perform multinomial regression analysis, selecting the class variables as dependent variables and the variables presented in the Datasets section as covariates.

4.3. Assessing the Model Fit

From the Model Fitting Information output table of SPSS we observe that the chi-square value has a significance of p < 0.0001, so we state that there is a strong relationship between the dependent and independent variables (see Table 2). Next, we study the Pseudo R-Square table in SPSS, which also indicates the strength of the relationship between the dependent and independent variables. A good model fit is indicated by higher values. We will base our analysis on the Nagelkerke R^2 indicator (see Table 2). According to this, 74.5% of the output variation for the EconomicPerf dataset, and 97.8% for the FinancialPerf dataset, can be explained by variations in the input variables. Consequently, we regard the relationships as very strong.

To evaluate the accuracy of the model, we compute the proportional by-chance accuracy rate and the maximum by-chance accuracy rate. The proportional chance criterion for assessing model fit is calculated by summing the squared proportion of each group in the sample, and the maximum chance criterion is the proportion of cases in the largest group. We obtained the following indicators (Table 1):

Table 1. Evaluating the model's accuracy

Dataset         Model accuracy   Proportional by-chance criterion   Maximum by-chance criterion
EconomicPerf    61.3%            29.92%                             49.8%
FinancialPerf   88%              15.62%                             20.46%

We interpret these numbers as follows. In the case of the EconomicPerf dataset, based on the requirement that the model accuracy should be 25% better than the chance criteria [5], the standard to use for comparing the model's accuracy against the proportional chance criterion is 1.25 x 29.92% = 37.4%. Our model accuracy rate of 61.3% exceeds this standard. The maximum chance criterion accuracy rate is 49.8% for this dataset, so the corresponding standard is 1.25 x 49.8%, or approximately 62.2%. Our model accuracy rate of 61.3% is slightly below this standard. The FinancialPerf dataset accuracy rate exceeds both standards.

4.4. Interpreting the Results

To interpret the results of our analysis, we study the Likelihood Ratio Tests and Parameter Estimates outputs of SPSS. We find that the independent variables are all significant; in other words, they contribute significantly to explaining differences in performance classification (for both datasets). However, not all variables play an important role in all regression equations (e.g. for the first regression equation, CurrencyValue is not statistically significant, since its significance of 0.125 exceeds the threshold p = 0.05). Next, we can determine the direction of the relationship and the contribution to performance classification of each independent variable by looking at columns B and Exp(B) of the Parameter Estimates output of SPSS. For example, a higher industrial output rate increases the likelihood that the country will be classified as a best country (B = +24.027) and decreases the likelihood that the country will be classified among the poorest countries (B = -11.137). It seems that the results for the EconomicPerf dataset are poorer, in the sense that for the FinancialPerf dataset we have more coefficient estimates that are statistically significant.
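The by-chance criteria used when assessing the model fit above amount to simple arithmetic, which the following Python sketch reproduces. The class counts are hypothetical (the group sizes are not listed in the text); only the rates taken from Table 1 come from the paper, and the function names are illustrative.

```python
def proportional_chance(class_counts):
    """Proportional chance criterion: sum of squared class proportions."""
    total = sum(class_counts)
    return sum((c / total) ** 2 for c in class_counts)


def maximum_chance(class_counts):
    """Maximum chance criterion: proportion of cases in the largest class."""
    return max(class_counts) / sum(class_counts)


def meets_standard(model_accuracy, chance_rate, margin=1.25):
    """True if the model accuracy is at least 25% better than the chance rate."""
    return model_accuracy >= margin * chance_rate


# Hypothetical class sizes, used only to show how the criteria are computed.
counts = [50, 30, 20]
example_proportional = proportional_chance(counts)
example_maximum = maximum_chance(counts)

# Rates reported in Table 1, in percent: (model accuracy, proportional, maximum).
table_1 = {
    "EconomicPerf": (61.3, 29.92, 49.8),
    "FinancialPerf": (88.0, 15.62, 20.46),
}
verdicts = {
    name: (meets_standard(acc, prop), meets_standard(acc, maxi))
    for name, (acc, prop, maxi) in table_1.items()
}
```

With the Table 1 rates, `verdicts` shows EconomicPerf passing the proportional standard but not the maximum one, and FinancialPerf passing both, in line with the interpretation given in the text. Because the sketch starts from the rounded 49.8%, its maximum-chance standard can differ in the second decimal from a value computed on unrounded proportions.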
In the Sig. column of the Parameter Estimates output of SPSS, we find that the EconomicPerf dataset has 33% significant coefficients, while the FinancialPerf dataset has 62.5%.

4.5. Validating the Model

In order to validate the model, we split each dataset into two parts of approximately the same length. Our findings are illustrated in Table 2:

Table 2. Dataset accuracy rates and accuracy rate estimators when applying multinomial logistic regression

EconomicPerf                           Main dataset     Part 1 (split=0)          Part 2 (split=1)
Model Chi-Square (p < 0.0001)          291.[...]        [...]                     [...].852
Nagelkerke R^2                         0.745            0.855                     0.721
Learning                               61.3%            67%                       58.4%
Test                                   no test sample   57.6%                     67.1%
Significant coefficients (p < 0.05)    ALL              ALL except CURRENCY (1)   ALL

FinancialPerf                          Main dataset     Part 1 (split=0)          Part 2 (split=1)
Model Chi-Square (p < 0.0001)          1479.72          792.06                    752.85
Nagelkerke R^2                         0.978            0.986                     0.981
Learning                               88%              89%                       89.5%
Test                                   no test sample   76.1%                     82.4%
Significant coefficients (p < 0.001)   ALL              ALL                       ALL

(1) This coefficient is significant for p < 0.153.

With one exception, we obtained significant coefficients for the logistic regression equations. In both cases, the accuracy rates of the two split datasets were close to the accuracy rate of the entire dataset. For example, 89% and 89.5% are close to the entire FinancialPerf dataset accuracy rate of 88%. Again, the second dataset outperformed the first one, in the sense that for the FinancialPerf dataset the accuracy rates for the test samples are closer to the learning sample accuracy rate. However, more investigation is needed into the problems that arise from insignificant coefficients in the regression equations. Large standard errors for B coefficients can be caused by multicollinearity among independent variables, which is not directly handled by SPSS or other statistical packages. Moreover, the problems of outliers and variable selection should be carefully addressed. Also, the discrepancies between learning and test accuracy rates can arise from the small sizes of the datasets. The larger the dataset is, the better the chance that we have correctly clustered data and, consequently, correct outcome values for each data row. We construct the outcome values based on SOM clustering. There is, of course, a chance that there are misclustered data, which can affect the accuracy of the model.

4.6. Predicting the Classes

The finished model was then used to test the classification of three new data rows for the FinancialPerf dataset. These consisted of data for three Finnish pulp and paper companies: M-Real (no. 3), Stora Enso (no. 4), and UPM-Kymmene (no. 5), for the year [...]. These were used since they were among the first to publish their financial results. The results are illustrated in Table 3.

Table 3. Predictions using multinomial logistic regression (input columns: Operating Margin, ROE, ROTA, Equity to Capital, Quick Ratio, Interest Coverage, Receivables Turnover)

Company no.   Predicted Cluster
3             D
4             B
5             A

5. Applying the Decision Tree Algorithm

For comparison, the See5 decision tree builder system was applied to both datasets. The system was developed by a research team headed by Quinlan. The algorithm behind the program is based on one of the most popular decision tree algorithms, ID3 [11], which was developed in the late 1970s, also by Quinlan. The main idea is that, at each step, the algorithm tries to select the variable and associated value that best discriminate the dataset, and does this recursively for each subset until all the cases in each subset belong to a single class. The method is called Top-Down Induction Of Decision Trees (TDIDT), and C4.5 and C5.0/See5 represent different implementations of this method. The best discriminating (variable, value) pair is chosen based on the so-called gain ratio criterion:

    \text{gain ratio}(X) = \frac{\text{gain}(X)}{\text{split info}(X)}    [Eq. 3]

where gain(X) means the information gained by splitting the data using the test X, and

    \text{split info}(X) = -\sum_{i=1}^{n} \frac{|S_i|}{|S|} \log_2 \frac{|S_i|}{|S|}    [Eq. 4]

represents the potential information generated by dividing S into n subsets. The See5 system implements these formulas along with some other features that are described in [11] and on the program's web page.

5.1. Computing the Decision Tree

For both datasets, we performed three runs of the See5 software, exactly as we did when applying logistic regression: one for the whole dataset, another using the first split dataset ("split=0"), and the third using the second half of the data ("split=1"). When validating the entire dataset's accuracy rate, we used cross-validation, while when validating one split dataset's accuracy rate we used the other one as the test sample. The results are summarized in Table 4. The first line, for each dataset, represents the accuracy rates obtained using the training datasets. The next two lines show the validation accuracy rates, calculated as follows: for the main dataset a 10-fold cross-validation was conducted (64% being the average accuracy rate of 10 decision trees), for the split=0 dataset we used split=1 as the test dataset (46.9% is the accuracy rate on the second dataset, based on the decision tree built with the first dataset), and the last accuracy rate was calculated by considering split=1 as the training dataset and split=0 as the test dataset (changing the roles).

Table 4. Dataset accuracy rates and accuracy rate estimators when applying the decision tree algorithm

EconomicPerf       Main dataset     Part 1                Part 2
Learning           79.1%            77.7%                 78.86%
Test               no test sample   46.9%                 54.5%
Cross-validation   64%              no cross-validation   no cross-validation

FinancialPerf      Main dataset     Part 1                Part 2
Learning           84.8%            86.5%                 86.5%
Test               74.6%            76.8%                 74.4%
Cross-validation   71.7%            no cross-validation   no cross-validation

When constructing the trees, we kept the two most important parameters constant: m = 5, the minimum number of cases each leaf node should have, and c = 25% (the default value), a confidence factor used in pruning the tree.

5.2. Assessing the Model Fit

For the EconomicPerf dataset, it seems that our trees were not consistent, owing to poor accuracy rates and large discrepancies between learning and test accuracy rates, so further comparison with regression analysis cannot be performed in this case. There is at least a 10% difference between the accuracy rates for each split dataset used. For the FinancialPerf dataset, the differences between accuracy rates are smaller. Therefore, we used this dataset for further investigation. The chosen decision tree is presented in the Appendix. Reading it, we can state that the main attribute used to discriminate the data was ROE. The lower we go down in the decision tree, the less important the attributes become.
At each step the algorithm calculates the information gain for each attribute and chooses the split attribute with the largest information gain; we call it the most important attribute.
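The gain ratio criterion used to rank candidate splits can be illustrated with a small Python sketch. It follows the standard definitions of entropy, gain and split information for a partition of class labels; the labels, the candidate splits and the function names are hypothetical, and the sketch makes no claim about how See5 implements the criterion internally.

```python
import math
from collections import Counter


def entropy(labels):
    """Information (in bits) needed to identify the class of a case in `labels`."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def gain(labels, subsets):
    """Information gained by partitioning `labels` into `subsets`."""
    total = len(labels)
    remainder = sum(len(s) / total * entropy(s) for s in subsets if s)
    return entropy(labels) - remainder


def split_info(labels, subsets):
    """Potential information generated by the partition itself."""
    total = len(labels)
    return -sum(len(s) / total * math.log2(len(s) / total) for s in subsets if s)


def gain_ratio(labels, subsets):
    """Gain divided by split information; defined as 0 for a degenerate split."""
    info = split_info(labels, subsets)
    return gain(labels, subsets) / info if info > 0 else 0.0


# Hypothetical class labels and two candidate tests on them.
labels = ["A", "A", "B", "B"]
pure_split = [["A", "A"], ["B", "B"]]      # separates the classes completely
useless_split = [["A", "B"], ["A", "B"]]   # leaves each subset as mixed as before

best = max([pure_split, useless_split], key=lambda s: gain_ratio(labels, s))
```

A tree builder of this kind would evaluate every candidate test in this way, keep the one with the highest score, and recurse on the resulting subsets.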

5.3. Interpreting the Results

As we can see from the decision tree (Appendix), the second most important variable depends upon the value of ROE: if ROE is greater than or equal to [...], it is Equity to Capital, while if ROE is less than or equal to [...], it is Receivables Turnover. We must note that we have used fuzzy thresholds, which allow for a much more flexible decision tree: the algorithm (C5.0) assigns a lower value (lv) and an upper value (uv) for each attribute chosen to split the data. Then a (trapezoidal) membership function is used to decide which branch of the tree will be followed when a new case has to be classified. If the value of the splitting attribute for the new case is lower than lv, the left branch will be followed, and if it is greater than uv, the right branch will be followed. If the value lies between lv and uv, both branches of the tree are investigated and the results are combined probabilistically; the branch with the highest probability will be followed. Notice the asymmetric threshold values for almost every splitting attribute.

5.4. Validating the Model

In this case (FinancialPerf), the accuracy rate of the test sample is comparable with the accuracy rate of the learning sample. There is no specification of how close these two values should be; consequently, we conclude that the tree is validated. The only way to really validate the assumption that the two accuracy rates are not far from one another is to consider the two accuracy rates as random variables and then use a statistical test to see if their means differ significantly. This new step in validating the decision tree model would require splitting the dataset in different ways to obtain different training and test datasets. Then, under the assumption that the accuracy rates are random variables that follow a normal distribution, which is not always the case, we would test whether their means are statistically different.

5.5. Predicting the Classes

After training the decision tree, we tested it on the same data rows that were used with the logistic regression model. The results are illustrated in Table 5.
As can be seen in the table, the results are somewhat different from those obtained using logistic regression.

Table 5. Predictions using the decision tree (input columns: Operating Margin, ROE, ROTA, Equity to Capital, Quick Ratio, Interest Coverage, Receivables Turnover)

Company no.   Predicted Cluster
3             B
4             B
5             A

M-Real (no. 3) was classified as a D company in Table 3, while it is a B company in Table 5. The data rows of Stora Enso and M-Real are generally similar, but the decision tree has placed more emphasis on ROE, while logistic regression seems to have emphasized Equity to Capital. Also, we can see from Table 6 that the decision tree has not quite correctly learned the pattern associated with Group D, only being able to correctly classify 58% of the cases in this group. The logistic regression model was much more successful, and we therefore consider its prediction the more reliable of the two. More study will be needed to judge why this happened.

6. Comparing the Classification Models' Accuracy

While this is not the only way to compare two classification techniques, comparing them using accuracy rates is the most common. In [10] the author compared five predictive models from areas of both machine learning and statistics. A comparison similar to ours was made in [13]. The authors compared logistic regression and decision tree induction in the diagnosis of carpal tunnel syndrome. Their findings indicate that there is no significant difference between the two methods in terms of model accuracy rates. Also, they suggest that the classification accuracy of the bivariate models (two independent variables) is slightly higher than that of multivariate ones. It is not our goal to compare bivariate and multivariate models, although this could be a subject for further investigation using the datasets presented in this paper.

As we stated in Section 5, we will consider only the second dataset when comparing the two methods, since for the first dataset the results were very poor in terms of the accuracy rate. In the last section, we will try to explain why we obtained such poor results using the EconomicPerf dataset.

Conversely, in the case of the second dataset (FinancialPerf) both the logistic regression and the decision tree models were validated against the split datasets. The differences between accuracy rates were smaller in this case, and the learning dataset accuracy rates were very good (88% and 84.8%). Also, both models performed similarly on the split datasets (89% and 89.5%, against 86.5% and 86.5%). The bigger difference for the training datasets could be caused by the fact that, when applying the decision tree algorithm, we split the data in two parts, using 75% of the rows for the learning dataset. The remaining 25% was used as a test dataset. This was due to a number-of-rows restriction in the See5 demo software (max 400 rows of data). Using logistic regression, changes in accuracy rates can occur when including variables in, or excluding them from, the model. In the case of the decision tree, the accuracy rate of the model can be tuned using model parameters, e.g. the minimum number of cases in each leaf (m) or the pruning confidence factor (c). The accuracy rates for the two methods are illustrated in Table 6.

Table 6. The observed accuracy rates of the two methods (axes labelled Observed and Predicted, classes a to g; the non-empty cells of each row are listed from left to right)

Logistic Regression
a   88%   6%    2%    4%
b   5%    89%   3%    2%
c   6%    6%    77%   4%    4%    2%
d   6%    2%    84%   8%
e   7%    1%    88%   4%
f   11%   89%
g   3%    97%

Decision Tree
a   86%   10%   4%
b   4%    87%   5%    1%    3%
c   3%    8%    76%   5%    8%
d   0%    18%   6%    58%   12%   6%
e   2%    93%   2%    4%
f   3%    94%   3%
g   2%    4%    4%    90%

7. Discussion and Conclusions

In this study, we have proposed a new two-level approach for making class predictions about countries'/companies' economic/financial performance. We have applied our methodology on two datasets: the EconomicPerf dataset, which includes variables describing the economic performance of central-east European countries during [...], and the FinancialPerf dataset, which includes financial ratios describing the financial performance of international pulp and paper companies during [...].

Firstly, SOM clustering was applied on both datasets in order to identify clusters in terms of economic/financial performance, and the optimal number of clusters to consider. By reading the SOM output (U-matrix maps), we have considered seven to be the most appropriate number of clusters for both datasets. Consequently, we construct the outcome values for each data row based on the SOM maps and the corresponding seven classes: best, slightly below best, slightly above average, average, slightly below average, slightly above poorest, and poorest.

Secondly, based on the new datasets (updated with the outcome values), we have predicted to which class a new input belongs. We chose and compared two predictive models for classification: logistic regression and decision tree induction.

Why is this approach important? Why combine clustering and classification techniques? Why not directly construct the outcome values and apply the predictive models without performing any clustering?
We could perform surveys, asking experts how their company/country performed in different months or years, and then directly apply the classification technique to develop prediction models for classifying new cases. First of all, this kind of information (outcome values for each data row) is costly and not easy to obtain, and secondly, even if we have it, in order to be useful it has to be "true" and "comparable". What we mean by "true" is that, when answering surveys, the respondents can be subjective, giving higher rankings to their own country/company (not giving true answers). The outcome values can fail to be "comparable" if, for example, one person has different criteria for the term "best performance" than another. In the best case, when answering our questions about their country/company performance, the respondents would most probably classify their country/company using their own knowledge and internal aggregate information. We think our methodology is an objective way of making class predictions about countries'/companies' performance since, using it, we can choose the correct number of clusters, define the outcome values for each data row, and construct the predictive model.

The methodology also addresses the problem of inserting new data into an existing model. Normally, in order to add new labels to the maps, we have to train new maps every time or standardize the new data according to the variance of the old dataset. Inserting new data into an existing SOM model becomes a problem when the data have been standardized, for example, within an interval like [0,1]. Also, the retraining of maps requires considerable time and expertise. We propose that our methodology solves these problems associated with adding new data to an existing SOM cluster model.

The results show that our methodology can be successful if it is correctly implemented. Clustering is very important in our methodology, since we define the outcome values for each data row based on it. Our U-matrix maps clearly show seven identifiable clusters.
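The standardization issue mentioned in connection with adding new data to an existing SOM model can be illustrated with a minimal sketch. It assumes, purely for illustration, that the per-variable statistics of the old dataset are stored and reused for new rows. The helper names (fit_statistics, z_score, min_max) and all numbers are hypothetical and are not taken from the EconomicPerf or FinancialPerf datasets. The sketch shows that a new row scaled with the old minimum and maximum can fall outside [0,1], whereas scaling with the old mean and standard deviation remains well defined.

```python
from statistics import mean, pstdev

def fit_statistics(old_rows):
    """Collect per-variable statistics from the old dataset (hypothetical helper)."""
    columns = list(zip(*old_rows))
    return {
        "mean": [mean(c) for c in columns],
        "std": [pstdev(c) for c in columns],
        "min": [min(c) for c in columns],
        "max": [max(c) for c in columns],
    }

def z_score(row, stats):
    """Standardize a row with the old dataset's mean and standard deviation."""
    return [(x - m) / s if s > 0 else 0.0
            for x, m, s in zip(row, stats["mean"], stats["std"])]

def min_max(row, stats):
    """Scale a row with the old dataset's minimum and maximum."""
    return [(x - lo) / (hi - lo) if hi > lo else 0.0
            for x, lo, hi in zip(row, stats["min"], stats["max"])]

# Illustrative values only: two made-up variables, three old rows, one new row.
old_rows = [[2.0, 10.0], [4.0, 20.0], [6.0, 30.0]]
new_row = [8.0, 20.0]

stats = fit_statistics(old_rows)
scaled = min_max(new_row, stats)        # first value exceeds 1, i.e. leaves [0, 1]
standardized = z_score(new_row, stats)  # uses the stored mean and standard deviation
print(scaled, standardized)
```

In such a setting, a classifier trained on the class labels only needs the stored statistics to process a new row, so the map itself does not have to be retrained.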
More investigation is needed into the utility of each clustering or, in other words, into "how well" we clustered the data. To evaluate the maps we used two criteria: the average quantization error and the ease of readability of each map. As a further research problem, we would try to develop a new measure, or use an existing one, to validate the clustering.

When applying logistic regression, we obtained models with acceptable accuracy rates. All the coefficients of all regression equations were statistically significant except one (CURRENCY for the EconomicPerf dataset). The accuracy rates were evaluated using two criteria: the proportional by chance criterion and the maximum by chance criterion. The first dataset's accuracy rate did not satisfy the second criterion. When comparing the two classification techniques, we therefore only took into consideration the results for the second dataset. However, as in [13], our findings suggest that the results of the two classification techniques are similar in terms of accuracy rate. Also, when making predictions using the two models, we used new data for the FinancialPerf dataset. Two out of three new data rows were classified in the same class by both predictive models (Stora Enso and UPM-Kymmene, to classes 2 and 1 respectively).

Improvements to our methodology would be to tackle the problem of variable selection for both the clustering and the classification phases, to find a new way to measure clustering utility, and to generalize the methodology. As further research, we will investigate different methods of improving our classification models.

Acknowledgements

The authors would like to thank Professor Barbro Back for her constructive comments on the article.

References

[1] B. Back, K. Sere, and H. Vanharanta, Managing Complexity in Large Data Bases Using Self-Organizing Maps, Accounting Management and Information Technologies 8 (4), Elsevier Science Ltd, Oxford, 1998, pp.
[2] A. Costea, A. Kloptchenko, and B. Back, Analyzing Economical Performance of Central-East-European Countries Using Neural Networks and Cluster Analysis, in Proceedings of the Fifth International Symposium on Economic Informatics, I. Ivan and I. Rosca (eds), Bucharest, Romania, May, 2001, pp.
[3] T. Eklund, B. Back, H. Vanharanta, and A. Visa, Assessing the Feasibility of Self-Organizing Maps for Data Mining Financial Information, in Proceedings of the Xth European Conference on Information Systems (ECIS 2002), Gdansk, Poland, June 6-8, 2002, pp.
[4] J. F. Hair, Jr, R. Anderson, and R. L. Tatham, Multivariate Data Analysis with Readings, Second Edition, Macmillan Publishing Company, New York, New York.
[5] D. Hand, H. Mannila, and P. Smyth, Principles of Data Mining, MIT Press, Cambridge.
[6] S. Kaski and T. Kohonen, Exploratory Data Analysis by the Self-Organizing Map: Structures of Welfare and Poverty in the World, in Neural Networks in Financial Engineering, N. Apostolos, N. Refenes, Y. Abu-Mostafa, J. Moody, and A. Weigend (eds), World Scientific, Singapore, 1996, pp.
[7] J. P. Guiver and C. C. Klimasauskas, Applying Neural Networks, Part IV: Improving Performance, PC/AI Magazine 5 (4), Phoenix, Arizona, 1991, pp.
[8] T. Kohonen, Self-Organizing Maps, 2nd edition, Springer-Verlag, Heidelberg.
[9] J. Lehtinen, Financial Ratios in an International Comparison, Acta Wasaensia 49, Vasa.
[10] J. R. Quinlan, A Case Study in Machine Learning, in Proceedings of ACSC-16 Sixteenth Australian Computer Science Conference, Brisbane, Jan. 1993, pp.
[11] J. R. Quinlan, C4.5 Programs for Machine Learning, Morgan Kaufmann Series in Machine Learning, Morgan Kaufmann Publishers, San Mateo.
[12] J. Rhiannon, C. Jewitt, L. Galasso, and G. Fortemps, Consolidation Changes the Shape of the Top 150, Pulp and Paper International 43 (9), Paperloop, San Francisco, California, 2001, pp.
[13] S. Rudolfer, G. Paliouras, and I. Peers, A Comparison of Logistic Regression to Decision Tree Induction in the Diagnosis of Carpal Tunnel Syndrome, Computers and Biomedical Research 32, Academic Press, 1999.
[14] A. Ultsch, Self organized feature planes for monitoring and knowledge acquisition of a chemical process, in Proceedings of the International Conference on Artificial Neural Networks, Springer-Verlag, London, 1993, pp.
[15] J. Vesanto, Neural Network Tool for Data Mining: SOM Toolbox, in Proceedings of Symposium on Tool Environments and Development Methods for Intelligent Systems (TOOLMET2000), Oulun yliopistopaino, Oulu, Finland, 2000, pp.
[16] J. Vesanto and E. Alhoniemi, Clustering of the Self-Organizing Map, IEEE Transactions on Neural Networks 11 (3), IEEE Neural Networks Society, Piscataway, New Jersey, 2000, pp.
[17] I. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Academic Press, San Diego.

Appendix: the decision tree

3D Model Retrieval Method Based on Sample Prediction

3D Model Retrieval Method Based on Sample Prediction 20 Iteratioal Coferece o Computer Commuicatio ad Maagemet Proc.of CSIT vol.5 (20) (20) IACSIT Press, Sigapore 3D Model Retrieval Method Based o Sample Predictio Qigche Zhag, Ya Tag* School of Computer

More information

Euclidean Distance Based Feature Selection for Fault Detection Prediction Model in Semiconductor Manufacturing Process

Euclidean Distance Based Feature Selection for Fault Detection Prediction Model in Semiconductor Manufacturing Process Vol.133 (Iformatio Techology ad Computer Sciece 016), pp.85-89 http://dx.doi.org/10.1457/astl.016. Euclidea Distace Based Feature Selectio for Fault Detectio Predictio Model i Semicoductor Maufacturig

More information

A new algorithm to build feed forward neural networks.

A new algorithm to build feed forward neural networks. A ew algorithm to build feed forward eural etworks. Amit Thombre Cetre of Excellece, Software Techologies ad Kowledge Maagemet, Tech Mahidra, Pue, Idia Abstract The paper presets a ew algorithm to build

More information

Ones Assignment Method for Solving Traveling Salesman Problem

Ones Assignment Method for Solving Traveling Salesman Problem Joural of mathematics ad computer sciece 0 (0), 58-65 Oes Assigmet Method for Solvig Travelig Salesma Problem Hadi Basirzadeh Departmet of Mathematics, Shahid Chamra Uiversity, Ahvaz, Ira Article history:

More information

Pruning and Summarizing the Discovered Time Series Association Rules from Mechanical Sensor Data Qing YANG1,a,*, Shao-Yu WANG1,b, Ting-Ting ZHANG2,c

Pruning and Summarizing the Discovered Time Series Association Rules from Mechanical Sensor Data Qing YANG1,a,*, Shao-Yu WANG1,b, Ting-Ting ZHANG2,c Advaces i Egieerig Research (AER), volume 131 3rd Aual Iteratioal Coferece o Electroics, Electrical Egieerig ad Iformatio Sciece (EEEIS 2017) Pruig ad Summarizig the Discovered Time Series Associatio Rules

More information

Our second algorithm. Comp 135 Machine Learning Computer Science Tufts University. Decision Trees. Decision Trees. Decision Trees.

Our second algorithm. Comp 135 Machine Learning Computer Science Tufts University. Decision Trees. Decision Trees. Decision Trees. Comp 135 Machie Learig Computer Sciece Tufts Uiversity Fall 2017 Roi Khardo Some of these slides were adapted from previous slides by Carla Brodley Our secod algorithm Let s look at a simple dataset for

More information

Administrative UNSUPERVISED LEARNING. Unsupervised learning. Supervised learning 11/25/13. Final project. No office hours today

Administrative UNSUPERVISED LEARNING. Unsupervised learning. Supervised learning 11/25/13. Final project. No office hours today Admiistrative Fial project No office hours today UNSUPERVISED LEARNING David Kauchak CS 451 Fall 2013 Supervised learig Usupervised learig label label 1 label 3 model/ predictor label 4 label 5 Supervised

More information

Normal Distributions

Normal Distributions Normal Distributios Stacey Hacock Look at these three differet data sets Each histogram is overlaid with a curve : A B C A) Weights (g) of ewly bor lab rat pups B) Mea aual temperatures ( F ) i A Arbor,

More information

A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON

A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON A SOFTWARE MODEL FOR THE MULTILAYER PERCEPTRON Roberto Lopez ad Eugeio Oñate Iteratioal Ceter for Numerical Methods i Egieerig (CIMNE) Edificio C1, Gra Capitá s/, 08034 Barceloa, Spai ABSTRACT I this work

More information

Pattern Recognition Systems Lab 1 Least Mean Squares

Pattern Recognition Systems Lab 1 Least Mean Squares Patter Recogitio Systems Lab 1 Least Mea Squares 1. Objectives This laboratory work itroduces the OpeCV-based framework used throughout the course. I this assigmet a lie is fitted to a set of poits usig

More information

1 Enterprise Modeler

1 Enterprise Modeler 1 Eterprise Modeler Itroductio I BaaERP, a Busiess Cotrol Model ad a Eterprise Structure Model for multi-site cofiguratios are itroduced. Eterprise Structure Model Busiess Cotrol Models Busiess Fuctio

More information

DATA MINING II - 1DL460

DATA MINING II - 1DL460 DATA MINING II - 1DL460 Sprig 2017 A secod course i data miig http://www.it.uu.se/edu/course/homepage/ifoutv2/vt17/ Kjell Orsbor Uppsala Database Laboratory Departmet of Iformatio Techology, Uppsala Uiversity,

More information

Analysis of Documents Clustering Using Sampled Agglomerative Technique

Analysis of Documents Clustering Using Sampled Agglomerative Technique Aalysis of Documets Clusterig Usig Sampled Agglomerative Techique Omar H. Karam, Ahmed M. Hamad, ad Sheri M. Moussa Abstract I this paper a clusterig algorithm for documets is proposed that adapts a samplig-based

More information

Image Segmentation EEE 508

Image Segmentation EEE 508 Image Segmetatio Objective: to determie (etract) object boudaries. It is a process of partitioig a image ito distict regios by groupig together eighborig piels based o some predefied similarity criterio.

More information

Bayesian approach to reliability modelling for a probability of failure on demand parameter

Bayesian approach to reliability modelling for a probability of failure on demand parameter Bayesia approach to reliability modellig for a probability of failure o demad parameter BÖRCSÖK J., SCHAEFER S. Departmet of Computer Architecture ad System Programmig Uiversity Kassel, Wilhelmshöher Allee

More information

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method

A New Morphological 3D Shape Decomposition: Grayscale Interframe Interpolation Method A ew Morphological 3D Shape Decompositio: Grayscale Iterframe Iterpolatio Method D.. Vizireau Politehica Uiversity Bucharest, Romaia ae@comm.pub.ro R. M. Udrea Politehica Uiversity Bucharest, Romaia mihea@comm.pub.ro

More information

New HSL Distance Based Colour Clustering Algorithm

New HSL Distance Based Colour Clustering Algorithm The 4th Midwest Artificial Itelligece ad Cogitive Scieces Coferece (MAICS 03 pp 85-9 New Albay Idiaa USA April 3-4 03 New HSL Distace Based Colour Clusterig Algorithm Vasile Patrascu Departemet of Iformatics

More information

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov

Sorting in Linear Time. Data Structures and Algorithms Andrei Bulatov Sortig i Liear Time Data Structures ad Algorithms Adrei Bulatov Algorithms Sortig i Liear Time 7-2 Compariso Sorts The oly test that all the algorithms we have cosidered so far is compariso The oly iformatio

More information

Empirical Validate C&K Suite for Predict Fault-Proneness of Object-Oriented Classes Developed Using Fuzzy Logic.

Empirical Validate C&K Suite for Predict Fault-Proneness of Object-Oriented Classes Developed Using Fuzzy Logic. Empirical Validate C&K Suite for Predict Fault-Proeess of Object-Orieted Classes Developed Usig Fuzzy Logic. Mohammad Amro 1, Moataz Ahmed 1, Kaaa Faisal 2 1 Iformatio ad Computer Sciece Departmet, Kig

More information

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis

Analysis Metrics. Intro to Algorithm Analysis. Slides. 12. Alg Analysis. 12. Alg Analysis Itro to Algorithm Aalysis Aalysis Metrics Slides. Table of Cotets. Aalysis Metrics 3. Exact Aalysis Rules 4. Simple Summatio 5. Summatio Formulas 6. Order of Magitude 7. Big-O otatio 8. Big-O Theorems

More information

New Fuzzy Color Clustering Algorithm Based on hsl Similarity

New Fuzzy Color Clustering Algorithm Based on hsl Similarity IFSA-EUSFLAT 009 New Fuzzy Color Clusterig Algorithm Based o hsl Similarity Vasile Ptracu Departmet of Iformatics Techology Tarom Compay Bucharest Romaia Email: patrascu.v@gmail.com Abstract I this paper

More information

IMP: Superposer Integrated Morphometrics Package Superposition Tool

IMP: Superposer Integrated Morphometrics Package Superposition Tool IMP: Superposer Itegrated Morphometrics Package Superpositio Tool Programmig by: David Lieber ( 03) Caisius College 200 Mai St. Buffalo, NY 4208 Cocept by: H. David Sheets, Dept. of Physics, Caisius College

More information

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5

Morgan Kaufmann Publishers 26 February, COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Interface. Chapter 5 Morga Kaufma Publishers 26 February, 28 COMPUTER ORGANIZATION AND DESIGN The Hardware/Software Iterface 5 th Editio Chapter 5 Set-Associative Cache Architecture Performace Summary Whe CPU performace icreases:

More information

Investigating methods for improving Bagged k-nn classifiers

Investigating methods for improving Bagged k-nn classifiers Ivestigatig methods for improvig Bagged k-nn classifiers Fuad M. Alkoot Telecommuicatio & Navigatio Istitute, P.A.A.E.T. P.O.Box 4575, Alsalmia, 22046 Kuwait Abstract- We experimet with baggig knn classifiers

More information

( n+1 2 ) , position=(7+1)/2 =4,(median is observation #4) Median=10lb

( n+1 2 ) , position=(7+1)/2 =4,(median is observation #4) Median=10lb Chapter 3 Descriptive Measures Measures of Ceter (Cetral Tedecy) These measures will tell us where is the ceter of our data or where most typical value of a data set lies Mode the value that occurs most

More information

Evaluation scheme for Tracking in AMI

Evaluation scheme for Tracking in AMI A M I C o m m u i c a t i o A U G M E N T E D M U L T I - P A R T Y I N T E R A C T I O N http://www.amiproject.org/ Evaluatio scheme for Trackig i AMI S. Schreiber a D. Gatica-Perez b AMI WP4 Trackig:

More information

Harris Corner Detection Algorithm at Sub-pixel Level and Its Application Yuanfeng Han a, Peijiang Chen b * and Tian Meng c

Harris Corner Detection Algorithm at Sub-pixel Level and Its Application Yuanfeng Han a, Peijiang Chen b * and Tian Meng c Iteratioal Coferece o Computatioal Sciece ad Egieerig (ICCSE 015) Harris Corer Detectio Algorithm at Sub-pixel Level ad Its Applicatio Yuafeg Ha a, Peijiag Che b * ad Tia Meg c School of Automobile, Liyi

More information

Image based Cats and Possums Identification for Intelligent Trapping Systems

Image based Cats and Possums Identification for Intelligent Trapping Systems Volume 159 No, February 017 Image based Cats ad Possums Idetificatio for Itelliget Trappig Systems T. A. S. Achala Perera School of Egieerig Aucklad Uiversity of Techology New Zealad Joh Collis School

More information

UNIT 4 Section 8 Estimating Population Parameters using Confidence Intervals

UNIT 4 Section 8 Estimating Population Parameters using Confidence Intervals UNIT 4 Sectio 8 Estimatig Populatio Parameters usig Cofidece Itervals To make ifereces about a populatio that caot be surveyed etirely, sample statistics ca be take from a SRS of the populatio ad used

More information

Accuracy Improvement in Camera Calibration

Accuracy Improvement in Camera Calibration Accuracy Improvemet i Camera Calibratio FaJie L Qi Zag ad Reihard Klette CITR, Computer Sciece Departmet The Uiversity of Aucklad Tamaki Campus, Aucklad, New Zealad fli006, qza001@ec.aucklad.ac.z r.klette@aucklad.ac.z

More information

The Closest Line to a Data Set in the Plane. David Gurney Southeastern Louisiana University Hammond, Louisiana

The Closest Line to a Data Set in the Plane. David Gurney Southeastern Louisiana University Hammond, Louisiana The Closest Lie to a Data Set i the Plae David Gurey Southeaster Louisiaa Uiversity Hammod, Louisiaa ABSTRACT This paper looks at three differet measures of distace betwee a lie ad a data set i the plae:

More information

Fundamentals of Media Processing. Shin'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dinh Le

Fundamentals of Media Processing. Shin'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dinh Le Fudametals of Media Processig Shi'ichi Satoh Kazuya Kodama Hiroshi Mo Duy-Dih Le Today's topics Noparametric Methods Parze Widow k-nearest Neighbor Estimatio Clusterig Techiques k-meas Agglomerative Hierarchical

More information

Bezier curves. Figure 2 shows cubic Bezier curves for various control points. In a Bezier curve, only

Bezier curves. Figure 2 shows cubic Bezier curves for various control points. In a Bezier curve, only Edited: Yeh-Liag Hsu (998--; recommeded: Yeh-Liag Hsu (--9; last updated: Yeh-Liag Hsu (9--7. Note: This is the course material for ME55 Geometric modelig ad computer graphics, Yua Ze Uiversity. art of

More information

c-dominating Sets for Families of Graphs

c-dominating Sets for Families of Graphs c-domiatig Sets for Families of Graphs Kelsie Syder Mathematics Uiversity of Mary Washigto April 6, 011 1 Abstract The topic of domiatio i graphs has a rich history, begiig with chess ethusiasts i the

More information

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence

9.1. Sequences and Series. Sequences. What you should learn. Why you should learn it. Definition of Sequence _9.qxd // : AM Page Chapter 9 Sequeces, Series, ad Probability 9. Sequeces ad Series What you should lear Use sequece otatio to write the terms of sequeces. Use factorial otatio. Use summatio otatio to

More information

Python Programming: An Introduction to Computer Science

Python Programming: An Introduction to Computer Science Pytho Programmig: A Itroductio to Computer Sciece Chapter 1 Computers ad Programs 1 Objectives To uderstad the respective roles of hardware ad software i a computig system. To lear what computer scietists

More information

Analysis of Server Resource Consumption of Meteorological Satellite Application System Based on Contour Curve

Analysis of Server Resource Consumption of Meteorological Satellite Application System Based on Contour Curve Advaces i Computer, Sigals ad Systems (2018) 2: 19-25 Clausius Scietific Press, Caada Aalysis of Server Resource Cosumptio of Meteorological Satellite Applicatio System Based o Cotour Curve Xiagag Zhao

More information

Arithmetic Sequences

Arithmetic Sequences . Arithmetic Sequeces COMMON CORE Learig Stadards HSF-IF.A. HSF-BF.A.1a HSF-BF.A. HSF-LE.A. Essetial Questio How ca you use a arithmetic sequece to describe a patter? A arithmetic sequece is a ordered

More information

Improving Template Based Spike Detection

Improving Template Based Spike Detection Improvig Template Based Spike Detectio Kirk Smith, Member - IEEE Portlad State Uiversity petra@ee.pdx.edu Abstract Template matchig algorithms like SSE, Covolutio ad Maximum Likelihood are well kow for

More information

4.3 Modeling with Arithmetic Sequences

4.3 Modeling with Arithmetic Sequences Name Class Date 4.3 Modelig with Arithmetic Sequeces Essetial Questio: How ca you solve real-world problems usig arithmetic sequeces? Resource Locker Explore Iterpretig Models of Arithmetic Sequeces You

More information

Designing a learning system

Designing a learning system CS 75 Itro to Machie Learig Lecture Desigig a learig system Milos Hauskrecht milos@pitt.edu 539 Seott Square, -5 people.cs.pitt.edu/~milos/courses/cs75/ Admiistrivia No homework assigmet this week Please

More information

The isoperimetric problem on the hypercube

The isoperimetric problem on the hypercube The isoperimetric problem o the hypercube Prepared by: Steve Butler November 2, 2005 1 The isoperimetric problem We will cosider the -dimesioal hypercube Q Recall that the hypercube Q is a graph whose

More information

Algorithms for Disk Covering Problems with the Most Points

Algorithms for Disk Covering Problems with the Most Points Algorithms for Disk Coverig Problems with the Most Poits Bi Xiao Departmet of Computig Hog Kog Polytechic Uiversity Hug Hom, Kowloo, Hog Kog csbxiao@comp.polyu.edu.hk Qigfeg Zhuge, Yi He, Zili Shao, Edwi

More information

Recursive Procedures. How can you model the relationship between consecutive terms of a sequence?

Recursive Procedures. How can you model the relationship between consecutive terms of a sequence? 6. Recursive Procedures I Sectio 6.1, you used fuctio otatio to write a explicit formula to determie the value of ay term i a Sometimes it is easier to calculate oe term i a sequece usig the previous terms.

More information

Elementary Educational Computer

Elementary Educational Computer Chapter 5 Elemetary Educatioal Computer. Geeral structure of the Elemetary Educatioal Computer (EEC) The EEC coforms to the 5 uits structure defied by vo Neuma's model (.) All uits are preseted i a simplified

More information

Data Warehousing. Paper

Data Warehousing. Paper Data Warehousig Paper 28-25 Implemetig a fiacial balace scorecard o top of SAP R/3, usig CFO Visio as iterface. Ida Carapelle & Sophie De Baets, SOLID Parters, Brussels, Belgium (EUROPE) ABSTRACT Fiacial

More information

An Improved Shuffled Frog-Leaping Algorithm for Knapsack Problem

An Improved Shuffled Frog-Leaping Algorithm for Knapsack Problem A Improved Shuffled Frog-Leapig Algorithm for Kapsack Problem Zhoufag Li, Ya Zhou, ad Peg Cheg School of Iformatio Sciece ad Egieerig Hea Uiversity of Techology ZhegZhou, Chia lzhf1978@126.com Abstract.

More information

Unsupervised Discretization Using Kernel Density Estimation

Unsupervised Discretization Using Kernel Density Estimation Usupervised Discretizatio Usig Kerel Desity Estimatio Maregle Biba, Floriaa Esposito, Stefao Ferilli, Nicola Di Mauro, Teresa M.A Basile Departmet of Computer Sciece, Uiversity of Bari Via Oraboa 4, 7025

More information

An Efficient Algorithm for Graph Bisection of Triangularizations

An Efficient Algorithm for Graph Bisection of Triangularizations A Efficiet Algorithm for Graph Bisectio of Triagularizatios Gerold Jäger Departmet of Computer Sciece Washigto Uiversity Campus Box 1045 Oe Brookigs Drive St. Louis, Missouri 63130-4899, USA jaegerg@cse.wustl.edu

More information

MATHEMATICAL METHODS OF ANALYSIS AND EXPERIMENTAL DATA PROCESSING (Or Methods of Curve Fitting)

MATHEMATICAL METHODS OF ANALYSIS AND EXPERIMENTAL DATA PROCESSING (Or Methods of Curve Fitting) MATHEMATICAL METHODS OF ANALYSIS AND EXPERIMENTAL DATA PROCESSING (Or Methods of Curve Fittig) I this chapter, we will eamie some methods of aalysis ad data processig; data obtaied as a result of a give

More information

Our Learning Problem, Again

Our Learning Problem, Again Noparametric Desity Estimatio Matthew Stoe CS 520, Sprig 2000 Lecture 6 Our Learig Problem, Agai Use traiig data to estimate ukow probabilities ad probability desity fuctios So far, we have depeded o describig

More information

Journal of Chemical and Pharmaceutical Research, 2013, 5(12): Research Article

Journal of Chemical and Pharmaceutical Research, 2013, 5(12): Research Article Available olie www.jocpr.com Joural of Chemical ad Pharmaceutical Research, 2013, 5(12):745-749 Research Article ISSN : 0975-7384 CODEN(USA) : JCPRC5 K-meas algorithm i the optimal iitial cetroids based

More information

Performance Comparisons of PSO based Clustering

Performance Comparisons of PSO based Clustering Performace Comparisos of PSO based Clusterig Suresh Chadra Satapathy, 2 Guaidhi Pradha, 3 Sabyasachi Pattai, 4 JVR Murthy, 5 PVGD Prasad Reddy Ail Neeruoda Istitute of Techology ad Scieces, Sagivalas,Vishaapatam

More information

OCR Statistics 1. Working with data. Section 3: Measures of spread

OCR Statistics 1. Working with data. Section 3: Measures of spread Notes ad Eamples OCR Statistics 1 Workig with data Sectio 3: Measures of spread Just as there are several differet measures of cetral tedec (averages), there are a variet of statistical measures of spread.

More information

Lecture 5. Counting Sort / Radix Sort

Lecture 5. Counting Sort / Radix Sort Lecture 5. Coutig Sort / Radix Sort T. H. Corme, C. E. Leiserso ad R. L. Rivest Itroductio to Algorithms, 3rd Editio, MIT Press, 2009 Sugkyukwa Uiversity Hyuseug Choo choo@skku.edu Copyright 2000-2018

More information

Study on effective detection method for specific data of large database LI Jin-feng

Study on effective detection method for specific data of large database LI Jin-feng Iteratioal Coferece o Automatio, Mechaical Cotrol ad Computatioal Egieerig (AMCCE 205) Study o effective detectio method for specific data of large database LI Ji-feg (Vocatioal College of DogYig, Shadog

More information

Designing a learning system

Designing a learning system CS 75 Machie Learig Lecture Desigig a learig system Milos Hauskrecht milos@cs.pitt.edu 539 Seott Square, x-5 people.cs.pitt.edu/~milos/courses/cs75/ Admiistrivia No homework assigmet this week Please try

More information

Research on Identification Model of Financial Fraud of Listed Company Based on Data Mining Technology

Research on Identification Model of Financial Fraud of Listed Company Based on Data Mining Technology 208 2d Iteratioal Coferece o Systems, Computig, ad Applicatios (SYSTCA 208) Research o Idetificatio Model of Fiacial Fraud of Listed Compay Based o Data Miig Techology Jiaqi Hu, Xiao Che School of Busiess,

More information

Ch 9.3 Geometric Sequences and Series Lessons

Ch 9.3 Geometric Sequences and Series Lessons Ch 9.3 Geometric Sequeces ad Series Lessos SKILLS OBJECTIVES Recogize a geometric sequece. Fid the geeral, th term of a geometric sequece. Evaluate a fiite geometric series. Evaluate a ifiite geometric

More information

Revisiting the performance of mixtures of software reliability growth models

Revisiting the performance of mixtures of software reliability growth models Revisitig the performace of mixtures of software reliability growth models Peter A. Keiller 1, Charles J. Kim 1, Joh Trimble 1, ad Marlo Mejias 2 1 Departmet of Systems ad Computer Sciece 2 Departmet of

More information

1.2 Binomial Coefficients and Subsets

1.2 Binomial Coefficients and Subsets 1.2. BINOMIAL COEFFICIENTS AND SUBSETS 13 1.2 Biomial Coefficiets ad Subsets 1.2-1 The loop below is part of a program to determie the umber of triagles formed by poits i the plae. for i =1 to for j =

More information

Chapter 3 Classification of FFT Processor Algorithms

Chapter 3 Classification of FFT Processor Algorithms Chapter Classificatio of FFT Processor Algorithms The computatioal complexity of the Discrete Fourier trasform (DFT) is very high. It requires () 2 complex multiplicatios ad () complex additios [5]. As

More information

Chapter 10. Defining Classes. Copyright 2015 Pearson Education, Ltd.. All rights reserved.

Chapter 10. Defining Classes. Copyright 2015 Pearson Education, Ltd.. All rights reserved. Chapter 10 Defiig Classes Copyright 2015 Pearso Educatio, Ltd.. All rights reserved. Overview 10.1 Structures 10.2 Classes 10.3 Abstract Data Types 10.4 Itroductio to Iheritace Copyright 2015 Pearso Educatio,

More information

Criterion in selecting the clustering algorithm in Radial Basis Functional Link Nets

Criterion in selecting the clustering algorithm in Radial Basis Functional Link Nets WSEAS TRANSACTIONS o SYSTEMS Ag Sau Loog, Og Hog Choo, Low Heg Chi Criterio i selectig the clusterig algorithm i Radial Basis Fuctioal Lik Nets ANG SAU LOONG 1, ONG HONG CHOON 2 & LOW HENG CHIN 3 Departmet

More information

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming

Lecture Notes 6 Introduction to algorithm analysis CSS 501 Data Structures and Object-Oriented Programming Lecture Notes 6 Itroductio to algorithm aalysis CSS 501 Data Structures ad Object-Orieted Programmig Readig for this lecture: Carrao, Chapter 10 To be covered i this lecture: Itroductio to algorithm aalysis

More information
