Cache Sharing Management for Performance Fairness in Chip Multiprocessors


Xing Zhou, Wenguang Chen, Weimin Zheng
Dept. of Computer Science and Technology, Tsinghua University, Beijing, China
zhoux07@mails.tsinghua.edu.cn, {cwg, zwm-dcs}@tsinghua.edu.cn

Abstract

Resource sharing can cause unfair and unpredictable performance of concurrently executing applications in Chip Multiprocessors (CMP). The shared last-level cache is one of the most important shared resources because off-chip request latency may take a significant part of total execution cycles for data intensive applications. Instead of enforcing ideal performance fairness directly, prior work addressing the fairness issue of cache sharing mainly focuses on fairness metrics based on cache miss numbers or miss rates. However, because of the variation of cache miss penalty, fairness on cache misses cannot guarantee ideal fairness. Cache sharing management which directly addresses ideal performance fairness is needed for CMP systems. This paper introduces a model to analyze the performance impact of cache sharing, and proposes a cache sharing management mechanism to provide performance fairness for concurrently executing applications. The proposed mechanism monitors the actual penalty of all cache misses and uses an Auxiliary Tag Directory (ATD) to dynamically estimate the cache misses with a dedicated cache while the applications are actually running with the shared cache. The estimated relative slowdown of each core from the dedicated environment to the shared environment is used to guide cache sharing in order to guarantee performance fairness. The experiment results show that the proposed mechanism always improves the performance fairness metric, and provides no worse throughput than the scenario without any management mechanism.

1. Introduction

Data intensive applications usually spend a large proportion of their total execution cycles on memory accesses because of the long latency of off-chip requests. The on-chip last level cache (usually the L2 or L3 cache), which is the key component for hiding off-chip request latency, is typically shared by multiple cores in a Chip Multiprocessor (CMP) architecture. As a result, the sharing of the last level cache has a significant impact on the performance of applications executing concurrently on different cores. The technologies of server consolidation and virtual machines are creating the demand to schedule heterogeneous workloads together. The different memory access characteristics of heterogeneous workloads lead to unfair cache sharing, which breaks the hardware fairness assumption of the scheduler and may bring thread starvation, priority inversion and other problems to the operating system's process scheduler [6].

Prior work has noticed the problem of unfair cache sharing and the consequential result of unfair performance. [6] proposed a cache partitioning mechanism to enhance the fairness of cache sharing. Although intuitively the ideal fairness of cache sharing should be equal slowdowns relative to running with a dedicated cache for all co-scheduled applications, [6] uses the equal increase of cache miss numbers or miss rates as fairness metrics, because [6] points out that ideal fairness is hard to measure and cache miss fairness usually correlates highly with ideal fairness. The mechanism proposed in [6] is an improvement over unmanaged caches. However, because cache miss fairness is not identical to ideal fairness, an improvement in cache miss fairness cannot guarantee the same degree of improvement in ideal fairness, and enforcing fairness on cache misses does not necessarily lead to ideal fairness.
To explain the above, we should note that the correlation between cache miss fairness and ideal (performance) fairness actually depends on two factors: (1) The performance sensitivity of each application to cache misses varies, as applications spend differing fractions of execution time stalled on cache misses. The performance of applications with a large fraction of execution cycles stalled on misses is more sensitive to variations in cache misses. This fraction is diverse; for example, data centric applications spend many more cycles on memory accesses (stalled by cache misses) than computation oriented applications. (2) The stall arising from each miss varies as a function of Memory Level Parallelism (MLP) [13][2] and ILP. Clustered cache misses have their latency cycles overlapped with those of recent misses, so the average penalty of each cache miss can be reduced.
So the actual penalty of each cache miss is smaller than the memory access latency and varies with different memory access characteristics. Besides, computation cycles and memory access latency cycles may overlap, which can also reduce the actual penalty of cache misses. Thus, in order to enforce ideal fairness on cache sharing, we must first build an analytic model to account for the performance impact of cache sharing. The analytic model should help divide an application's whole execution cycles into a private part, which is not related to cache sharing, and a vulnerable part, which is susceptible to additional cache misses caused by cache sharing. Then, according to the analytic performance model, we can develop a mechanism that provides a reference point for the ideal fairness metric by measuring or estimating the application's performance with a dedicated cache while it is actually running with a shared cache.

This paper makes two contributions: (1) We propose a model to analyze the performance impact of cache sharing for CMPs, considering not only the additional cache misses caused by cache interference between concurrent workloads, but also the variation of the actual penalty of each cache miss. (2) To the best of our knowledge, this paper proposes the first mechanism that partitions the shared cache and enforces fairness with the goal of ideal performance fairness. The mechanism is dynamic and adaptive, and no static profiling is needed. The proposed mechanism always improves the performance fairness metric, and provides no worse throughput than a cache without any management mechanism.

The rest of the paper is organized as follows. Section 2 gives a definition of ideal fairness as well as the cache miss fairness used in prior work. Section 3 introduces an accurate model, which connects cache miss rates and overall performance, to estimate the performance impact of cache sharing. In Section 4, the hardware mechanism for enforcing fairness on a shared cache in a typical CMP architecture is described in detail. Section 5 describes the experiment methodology and discusses experiment results. Section 6 briefly introduces related work and Section 7 concludes.

2. Defining Performance Fairness

We define performance fairness, which is the ideal fairness discussed above, as identical slowdown for each concurrent workload running with the shared cache compared to running with a separate, dedicated cache; this is called execution time fairness in [6]. Let T_i denote the execution time (count of execution cycles) of workload i with a dedicated cache and T'_i the execution time with the shared cache. Performance fairness is achieved when:

    T'_i / T_i = T'_j / T_j    (1)

for every pair of concurrent workloads i and j. T_i and T'_i can be replaced by CPI_i and CPI'_i for a certain slice of running instructions, and then Equation 1 can be transformed to:

    CPI'_i / CPI_i = CPI'_j / CPI_j    (2)

Actually, CPI'_i / CPI_i is the slowdown of workload i running with the shared cache compared to the dedicated cache. We define the performance fairness metric of the concurrently running workloads under a certain cache partition as M_perf:

    M_perf = Σ_{i,j} | CPI'_i / CPI_i − CPI'_j / CPI_j |    (3)

Intuitively, M_perf is the sum of the slowdown differences between every two co-scheduled workloads. The smaller M_perf is, the smaller the slowdown difference among all co-scheduled workloads, and thus the better the performance fairness. If M_perf equals zero, perfect performance fairness is achieved.
For the purpose of comparison, we also define that cache miss fairness is achieved when:

    MPKI'_i / MPKI_i = MPKI'_j / MPKI_j    (4)

for every pair of concurrent workloads i and j, where MPKI_i and MPKI'_i denote the miss count per thousand instructions when workload i is running with a dedicated cache and with the shared cache, respectively. The cache miss fairness metric is defined as:

    M_miss = Σ_{i,j} | MPKI'_i / MPKI_i − MPKI'_j / MPKI_j |    (5)

Similar to M_perf, the smaller M_miss is, the better the cache miss fairness, and perfect cache miss fairness is achieved when M_miss equals zero. M_miss is the same as M_3 in [6] and FM3 in [7]. [6] proposed five fairness metrics, but M_2 has been shown to have a poor correlation with performance fairness; M_1 and M_3 correlate the most highly with performance fairness for most cases. If the workload runs the same number of instructions with the dedicated cache and the shared cache, M_miss is also the same as M_1. So M_miss is representative enough.
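To make these two metrics concrete, the following minimal sketch (ours, not from the paper; the two-core numbers are purely illustrative) computes M_perf and M_miss from per-core CPI and MPKI values measured with dedicated and shared caches:

    from itertools import combinations

    def m_perf(cpi_shared, cpi_dedicated):
        """Sum of pairwise slowdown differences (Equation 3).
        cpi_shared[i] / cpi_dedicated[i] is the slowdown of core i."""
        slowdown = [s / d for s, d in zip(cpi_shared, cpi_dedicated)]
        return sum(abs(slowdown[i] - slowdown[j])
                   for i, j in combinations(range(len(slowdown)), 2))

    def m_miss(mpki_shared, mpki_dedicated):
        """Sum of pairwise MPKI-increase differences (Equation 5)."""
        ratio = [s / d for s, d in zip(mpki_shared, mpki_dedicated)]
        return sum(abs(ratio[i] - ratio[j])
                   for i, j in combinations(range(len(ratio)), 2))

    # Illustrative two-core example: core 0 slows down 1.5x, core 1 slows down 1.2x.
    print(m_perf([1.5, 0.6], [1.0, 0.5]))   # |1.5 - 1.2| ≈ 0.3
    print(m_miss([12.0, 3.0], [8.0, 2.5]))  # |1.5 - 1.2| ≈ 0.3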

3. Modeling Performance Impact of Cache Sharing

We use a typical CMP architecture configuration for the following analysis: N cores on chip; each core has a private L1 instruction cache and data cache; a unified on-chip L2 cache shared by all on-chip cores is the last level cache, and a miss in the L2 cache issues an off-chip request. All caches are set-associative.

To model the performance impact of cache sharing, the cycles consumed during an application's execution can be categorized into two classes: private operation cycles (T_pr) and vulnerable operation cycles (T_vul). Intuitively, private operation cycles are the part of execution cycles which only depends on the characteristics of the workload itself. Vulnerable operation cycles are sensitive to the workloads co-scheduled on other cores and may vary widely because of resource sharing. Private operation cycles are consumed by those operations which only need the private resources of the core, such as computation time and the latency of fetching instructions and data from the private L1 caches. An L2 request is a hybrid operation. The tag lookup latency is accounted in private operation cycles because it is constant no matter whether the request misses or not. However, if the L2 cache misses, the cycles of off-chip request latency are vulnerable operation cycles, because this cache miss may be caused by cache sharing and the off-chip request costs extra cycles.

However, the whole execution time is not simply equal to the sum of T_pr and T_vul, because there are cycles in which private operations and vulnerable operations overlap. For example, in an out-of-order processor, even if an instruction is stalled by an L2 cache miss, other instructions in the scheduling window can still enter the pipeline if there is no data dependence. As a result, the overlap cycles (T_ovl) must be introduced, and then:

    T = T_pr + T_vul − T_ovl    (6)

The example in Figure 1 illustrates Equation 6.

[Figure 1. Illustration of Equation 6: pipeline activity, L1 misses and L2 misses over time; the horizontal axis represents elapsed time (cycles).]

In this example, during the execution time T, two L2 cache misses occur. After the first miss, the instruction window still has instructions with no data dependence on this miss, so the pipeline is not stalled until the second miss occurs. The computation time of the pipeline is certainly part of T_pr, and the off-chip request latency of the L2 misses is T_vul. Note that when L1 misses occur, the L1 request latency (the tag lookup latency in the L2 cache), such as slice A and slice B in the figure, is part of T_pr; however, if it ends up as an L2 miss, the latency of the off-chip request does not belong to T_pr any more. T_ovl is formed by those cycles in which T_pr and T_vul overlap.

Because the vulnerable operation cycles are mainly the cycles of off-chip request latency, T_vul can be approximated by the total penalty of off-chip requests. The average MLP needs to be considered to estimate the average penalty of each cache miss. We derive the average MLP definition in [2] as the average number of useful outstanding off-chip requests when there is at least one outstanding off-chip request, denoted by MLP_avg. We have the following equation:

    T_vul = Σ_{i=1}^{N_miss} Miss_Penalty_i = N_miss × Miss_Latency / MLP_avg    (7)

Then Equation 6 becomes:

    T = T_pr − T_ovl + N_miss × Miss_Latency / MLP_avg    (8)

In Equation 8 the latency of an off-chip request (Miss_Latency) can be treated as a constant (ignoring other factors such as bus congestion). Equations 6 and 8 show that the total execution time of an application includes three parts: T_pr is the inherent part that does not change with cache sharing; T_ovl and T_vul are related to cache sharing; they are the cause of unfair performance and the part of execution time that cache partition mechanisms aim to adjust.
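As a quick numeric illustration of Equations 7 and 8 (a sketch with assumed values, not measurements from the paper), the execution time can be assembled from the private cycles, the overlap cycles and the off-chip miss component:

    def total_cycles(t_pr, t_ovl, n_miss, miss_latency, mlp_avg):
        """Equation 8: T = T_pr - T_ovl + N_miss * Miss_Latency / MLP_avg."""
        t_vul = n_miss * miss_latency / mlp_avg   # Equation 7: total off-chip penalty
        return t_pr - t_ovl + t_vul

    # Illustrative numbers: 1M private cycles, 50K overlapped cycles,
    # 2000 L2 misses, 400-cycle memory round trip, average MLP of 1.6.
    print(total_cycles(1_000_000, 50_000, 2_000, 400, 1.6))  # 1,450,000 cycles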
According to Equation 8, if we can obtain the parameters T_ovl, N_miss and MLP_avg through a hardware profiler, the total execution time can be estimated dynamically.

4. Hardware Mechanism

The necessary hardware support includes two interdependent parts: the cache partition mechanism and the hardware profiler. Figure 2 shows how they work together. The hardware profiler gathers statistics from each core's pipeline, each private L1 instruction and data cache, and the unified L2 cache during the last period. At the end of the period, the cache partition decision is made and the cache partition mechanism applies the decided partition to the cache.
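The per-period control flow can be summarized in a short software sketch (ours, not the paper's hardware; the profiler and partitioner objects are hypothetical stand-ins for the monitoring circuits and the overalloc flags described in the following subsections):

    def repartition_each_period(cores, profiler, partitioner):
        """One profiling period: read end-of-period statistics for every core,
        estimate each core's slowdown, and mark the least-slowed core as
        over-allocated so it yields cache space during the next period."""
        slowdowns = {}
        for core in cores:
            stats = profiler.read(core)                    # profiled shared-cache statistics
            t_shared = stats.t_shared                      # measured execution cycles this period
            t_dedicated = stats.estimated_t_dedicated      # estimated via the ATD (Section 4.2)
            slowdowns[core] = t_shared / t_dedicated
        least_slowed = min(slowdowns, key=slowdowns.get)
        partitioner.set_overalloc_flag(least_slowed)       # guides victim selection next period
        profiler.reset()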

[Figure 2. Hardware Framework: a profiler is attached to each core and its private L1D/L1I caches; all cores share the unified L2 cache, which is controlled by the cache partition mechanism.]

4.1 Cache Partition Mechanism

The cache partition mechanism is the simpler part. Assuming there are N cores on chip sharing the L2 cache, we add log2(N) bits to each cache block to mark which core the cache block belongs to, and each core has an overalloc flag bit to indicate whether this core has been allocated too much cache space. When a cache miss from core i occurs, we first find the correct cache set; if the overalloc flag of core i shows that core i is not over-allocated, the original cache replacement policy is used to find a victim cache block, and the mark bits of this cache block are changed to core i; if the overalloc flag shows that core i has been allocated too much cache space, one cache block marked as belonging to core i is randomly selected as the victim. This mechanism guarantees that over-allocated cores cannot gain more cache lines, so their additional cache blocks are gradually taken over by the under-allocated cores, moving toward the partition target set by the overalloc flag of each core. A special situation is that, although the flag of core i shows it is over-allocated, there is no cache block marked as belonging to core i in the set; in this case we still allocate a cache line for core i.

[Figure 3. Cache Partition Mechanism.]
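A software sketch of the victim-selection rule above (ours; the Block class, the overalloc table and the lru_victim callback are simplified stand-ins for the per-line owner bits, the per-core overalloc flags and the baseline Pseudo-LRU logic):

    import random
    from dataclasses import dataclass

    @dataclass
    class Block:
        owner: int    # log2(N)-bit mark identifying the core that owns this line

    def choose_victim(cache_set, requesting_core, overalloc, lru_victim):
        """Pick a victim block for a miss from `requesting_core` in one cache set."""
        own = [b for b in cache_set if b.owner == requesting_core]
        if overalloc[requesting_core] and own:
            victim = random.choice(own)     # over-allocated: recycle one of this core's own blocks
        else:
            victim = lru_victim(cache_set)  # not over-allocated (or owns nothing here): normal LRU
        victim.owner = requesting_core      # the incoming block is marked for the requesting core
        return victim

    # Illustrative use: a 4-way set where core 1 is over-allocated.
    ways = [Block(0), Block(1), Block(1), Block(0)]
    print(choose_victim(ways, 1, {0: False, 1: True}, lambda s: s[0]).owner)   # 1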

4.2 Hardware Profiler

The hardware profiler is more complex. To achieve performance fairness for a shared-cache CMP, identical slowdowns compared to running with dedicated caches is the ideal goal. The key problem is to estimate the cycles needed by a workload with a dedicated cache while it is actually running with the shared cache. T_i and CPI_i as well as T'_i and CPI'_i for each core are needed to evaluate M_perf defined in Equation 3. The job of the hardware profiler is to profile the necessary runtime and statistical parameters while the applications are executing concurrently, and to use Equation 6 and Equation 8 to dynamically estimate the situation if running with a dedicated cache. According to Equation 6 and Equation 8:

    T' = T'_pr − T'_ovl + T'_vul    (9)

    T = T_pr − T_ovl + N_miss × Miss_Latency / MLP_avg    (10)

According to the definition of T_pr we have T'_pr = T_pr. So in order to enforce performance fairness, T', T'_ovl and T'_vul should be profiled in shared mode, and then T_pr can be calculated; at the same time, T_ovl, N_miss and MLP_avg for the dedicated-cache case need to be estimated; finally we obtain T and are able to evaluate M_perf. The data flow for the analysis is shown in Figure 4.

[Figure 4. Data Flow for Analysis: the profiled T', T'_ovl and T'_vul yield T_pr; T_pr together with the estimated T_ovl, N_miss and MLP_avg yields T; T and T' together yield M_perf.]

If we only want to enforce the cache miss fairness defined by Equation 4, things are much simpler: only N_miss and N'_miss are needed, and then M_miss defined by Equation 5 can be evaluated.

4.2.1 Profiling parameters for shared cache

The profiler calculates statistics every P_T cycles (the profiling period), so T' = P_T. To get T'_vul, we add a log2(N)-bit flag to each entry of the L2 MSHR (Miss Status Holding Register) to identify which core caused the miss; we then monitor the count of entries with the specified flag to see whether there is a request of this core in the MSHR. At the beginning of a profiling period, T'_vul is initialized to zero. If the MSHR has at least one entry for this core, T'_vul is increased because there is at least one outstanding off-chip request in this cycle.

    Algorithm 1: Calculating T'_vul
    /* at the beginning of each profiler period */
    T'_vul = 0;
    /* for each cycle */
    if the L2 MSHR has entries for this core then T'_vul++;

To decide whether a cycle of the whole execution time belongs to T'_ovl, three conditions should be considered: (1) whether the pipeline is stalled in this cycle; (2) whether there is at least one L1 request in this cycle; (3) whether there is at least one L2 request for this core in this cycle. Condition (2) only refers to the on-chip request; if the L1 request ends up as an L2 miss, the off-chip latency cycles do not count toward condition (2) (see the example shown in Figure 1). Condition (1) can easily be obtained by monitoring the status of the pipeline, and condition (3) by monitoring whether the L2 MSHR has entries for this core. To decide condition (2), we add a timer to each L1 cache, including the instruction cache and the data cache. Let L2_LAT denote the lookup latency of the L2 cache (L2_LAT is always a fixed value). If an L1 cache misses and issues a request to L2, the timer is set to L2_LAT. In each cycle, the timer is decreased by 1. Condition (2) can be decided by checking whether the timer is zero or not. Algorithm 2 describes the logic in detail.

    Algorithm 2: Calculating T'_ovl
    /* at the beginning of each profiler period */
    T'_ovl = 0;
    /* for each cycle; for the L1I and L1D of this core */
    if an L1 miss occurs in this cycle then timer of this L1 = L2_LAT; else timer of this L1 --;
    /* for each cycle */
    if the pipeline is not stalled /*(1)*/ then pipelineNotStall = true; else pipelineNotStall = false;
    if the L2 MSHR has entries for this core /*(3)*/ then hasL2Request = true; else hasL2Request = false;
    if the L1I or L1D timer is not zero /*(2)*/ then hasL1Request = true; else hasL1Request = false;
    if (pipelineNotStall or hasL1Request) and hasL2Request then T'_ovl++;

4.2.2 Profiling and estimating parameters for dedicated cache

The three parameters T_ovl, N_miss and MLP_avg shown in Figure 4 correspond to the application running with a dedicated cache. However, because the workload is actually running with the shared cache, we need special hardware to help estimate the situation when it is running with a dedicated cache. To achieve this, we use the Auxiliary Tag Directory (ATD) technique [13] to attach a virtual private L2 cache to each core. An ATD has the same associativity as the main tag directory of the shared L2 cache and uses the same replacement policy. However, an ATD only contains the tag and other functional bits of each cache block and does not keep data. We also add an auxiliary MSHR to each ATD, so that an ATD can act just as a private L2 cache would. Each entry in the auxiliary MSHR has a timer to simulate the memory access request. When a memory request is issued to an auxiliary MSHR, the timer of the inserted entry is set to the round-trip cycles of the memory access latency. In every cycle, all entries' timers are automatically decreased by 1. A timer value of 0 means the memory request has returned and the requested block is ready for the ATD.

Because there is an ATD for each core, the total hardware cost is substantial. We employ the set sampling technique [13] to reduce the storage cost. An ATD with set sampling only keeps cache sets sampled at a specified interval, and the behavior of the whole cache is approximated by the sampled cache sets. An ATD with a large sampling interval requires less hardware overhead, but gives less accurate statistics. So there is a tradeoff between hardware cost and accuracy in choosing the sampling interval.

With an ATD and an auxiliary MSHR, it is possible to simulate a private L2 cache for each core. When a request from the upper-level L1 cache arrives, it is forwarded to the real L2 cache as well as to the ATD attached to the core which the request sender (the L1 cache) belongs to. The ATD then responds to this request just as the L2 cache does, including touching the MRU bits, replacement, counting misses and issuing requests to the auxiliary MSHR.
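Algorithms 1 and 2 can be read as per-cycle counter updates; the following sketch (ours) mirrors that logic in software, with the MSHR, pipeline and timer state passed in as plain values and a single L1 timer standing in for the separate L1I/L1D timers:

    def update_profiler(state, l1_miss_this_cycle, pipeline_stalled,
                        l2_mshr_has_entries, l2_lookup_latency):
        """One cycle of the shared-mode profiler for a single core.
        `state` holds t_vul, t_ovl and the L1 request timer."""
        # Algorithm 1: count cycles with at least one outstanding off-chip request.
        if l2_mshr_has_entries:
            state["t_vul"] += 1

        # Condition (2): an L1->L2 lookup is in flight while the timer is non-zero.
        if l1_miss_this_cycle:
            state["timer"] = l2_lookup_latency
        elif state["timer"] > 0:
            state["timer"] -= 1
        has_l1_request = state["timer"] > 0

        # Algorithm 2: overlap cycle = (pipeline busy or L1 request) and an L2 miss outstanding.
        if (not pipeline_stalled or has_l1_request) and l2_mshr_has_entries:
            state["t_ovl"] += 1

    state = {"t_vul": 0, "t_ovl": 0, "timer": 0}   # reset at the start of each profiling period
    update_profiler(state, True, False, True, 12)  # one example cycle with a 12-cycle L2 lookup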
So N_miss and T_ovl can be obtained from the ATD and the auxiliary MSHR attached to each core, using mechanisms similar to those used to get T'_vul and T'_ovl. MLP_avg can also be obtained with the help of the auxiliary MSHR. MLP_avg is defined as the average number of useful outstanding off-chip requests when there is at least one outstanding off-chip request. Two counters, MLP_sum and MLP_cycles, are used to store the accumulated off-chip latency and the count of cycles during which there is at least one outstanding off-chip request. The procedure is described in Algorithm 3 below.

    Algorithm 3: Calculating MLP_avg
    /* at the beginning of each profiler period */
    MLP_sum = 0; MLP_cycles = 0;
    /* for each cycle */
    if the auxiliary MSHR is not empty then { MLP_cycles++; MLP_sum += count of auxiliary MSHR entries; }
    /* at the end of each profiler period */
    MLP_avg = MLP_sum / MLP_cycles;

Note that T_ovl and MLP_avg are both estimated values, while N_miss can be treated as an accurate value if set sampling is not employed. Because L2 cache misses have an impact on the pipeline, the relative order of L2 cache misses may differ between running with the shared cache and running with a dedicated cache, which makes the mechanism for calculating T_ovl and MLP_avg inaccurate compared to really running the application with a dedicated cache. However, if the application is actually running with a dedicated cache, the same algorithm can be used to obtain the accurate MLP by replacing the auxiliary MSHR with the main L2 MSHR.

Now the hardware profiler can provide all the parameters needed to calculate T_i and T'_i for each core. The slowdown of core i can be calculated at the end of the profiling period by:

    Slowdown_i = T'_i / T_i    (11)

The core with the least slowdown is selected as having been allocated too much cache space, by setting this core's overalloc flag. Through the cache partition mechanism, the slowdowns of the cores will move closer to each other and performance fairness will be improved.

4.3 Hardware cost

For each cache line, log2(N) bits are added to indicate which core the cache line belongs to, where N is the number of cores sharing the cache. In addition, there are some monitoring circuits in the L2 MSHR, the L1 caches and the pipeline. For each core, there are 2 additional L1 timers and 1 auxiliary MSHR, typically with 32 entries. The ATDs are the major part of the cost. Set sampling can significantly reduce the storage of an ATD. For the configuration of a 2-way CMP with a 1MB, 64-byte line size, 8-way associative L2 cache, sampling one of every 16 cache sets, the hardware cost is:

    For each ATD entry: (24-bit tag) + (4-bit LRU) + (1-bit valid) + (2 bits in addition) = 31 bits
    Number of ATD entries: (1MB L2) / (64-byte block) / (16-set sampling interval) = 1024
    Total cost for 1 ATD: 1024 × 31 bits < 4KB

The original L2 cost is 1MB + (24+4+1) bits × (1MB / 64B) = 1.45MB, so the additional cost for the 2 ATDs of the 2 cores is less than 1%: (2 × 4KB) / 1.45MB < 1%.
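The ATD storage accounting above can be reproduced in a few lines (a sketch of the same arithmetic; the 31-bit entry size and the 1-in-16 sampling interval are the values quoted in Section 4.3):

    def atd_bits(l2_bytes, line_bytes, sampling_interval, entry_bits):
        """Storage (in bits) of one set-sampled Auxiliary Tag Directory."""
        lines = l2_bytes // line_bytes           # tag entries in the full L2: 16384
        sampled = lines // sampling_interval     # entries kept in the sampled ATD: 1024
        return sampled * entry_bits

    bits = atd_bits(1 << 20, 64, 16, 31)         # 1MB L2, 64B lines, 1-in-16 sets, 31-bit entries
    print(bits, "bits =", bits // 8, "bytes")    # 31744 bits = 3968 bytes, just under 4KB per ATD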

5. Experimental Methodology

5.1 Configuration

The evaluation is performed using Simics [8], a whole-system simulator supporting CMPs. The memory timing model is derived from GEMS [9], which enables detailed cycle-accurate simulation of multiprocessor systems for Simics. Table 1 shows the basic parameters of the simulated architecture. The simulated CMP cores are out-of-order superscalar processors with private L1 instruction and data caches, sharing a unified L2 cache and all lower levels of the memory hierarchy. All caches are set-associative and use the Pseudo LRU policy for replacement decisions.

    Table 1. Basic configuration of the simulated architecture
    CMP:             2 cores on chip, sharing the on-chip L2 cache
    Processor core:  4-issue out-of-order; instruction window size: 64; re-order buffer size: 128;
                     private Icache and Dcache for each core
    L1 cache:        Icache: 32KB, 64B line size, 4-way, PLRU; Dcache: 32KB, 64B line size, 4-way, PLRU
    Shared L2 cache: unified, 1MB, 64B line size, 8-way, PLRU, 12-cycle hit, 32-entry MSHR,
                     shared by all cores on chip
    Memory:          400-cycle round-trip latency

5.2 Metrics

To evaluate the fairness improvement of different schemes against the baseline of uncontrolled L2 cache sharing (the original Pseudo LRU policy), we define:

    F_perf^scheme = M_perf^scheme / M_perf^PLRU    (12)

    F_miss^scheme = M_miss^scheme / M_miss^PLRU    (13)

M_perf^scheme and M_perf^PLRU are performance fairness metrics defined by Equation 3; M_miss^scheme and M_miss^PLRU are cache miss fairness metrics defined by Equation 5. Obviously F_perf^PLRU = 1 and F_miss^PLRU = 1. These are normalized fairness metrics: the smaller the value, the better the fairness of the scheme, and zero means perfect fairness. If the baseline already achieved perfect fairness, F_perf^scheme and F_miss^scheme would be infinite unless the cache partition scheme also achieved perfect fairness.

5.3 Benchmarks

We select a set of the most memory-intensive benchmarks from the SPEC CPU 2000 benchmark suite and run them concurrently in pairs on the simulated dual-core CMP system to see the benefit of different cache partition schemes, including uncontrolled L2 cache sharing using the original Pseudo LRU policy, enforcing cache miss fairness through cache partitioning, and enforcing performance fairness through cache partitioning. We compare the normalized cache miss fairness metric (F_miss^scheme) and the normalized performance fairness metric (F_perf^scheme) of all cache partition schemes against the baseline of uncontrolled L2 cache sharing. For each selected benchmark, a representative slice of instructions is obtained using the SimPoint tool [10] for simulation.

To categorize the selected benchmarks, we consider two features of each benchmark that can affect the correlation between cache miss fairness and performance fairness: (1) the portion of the whole execution time taken by vulnerable time, which is T_vul/T according to Equation 6; (2) the average MLP during the execution of the benchmark. Workloads with a high value of T_vul/T are more sensitive to interleaving, while workloads with high average MLP are more tolerant of last level cache misses. These two factors must be combined in the analysis.

    Table 2. Benchmark classification (T_vul/T, average MLP)
                High T_vul/T                                   Low T_vul/T
    High MLP:   mcf (55.2%, 1.53), art (56.5%, 1.82),          ammp (29.2%, 1.47), gzip (24.1%, 1.26)
                applu (72.3%, 1.21)
    Low MLP:    swm (63.3%, 1.10), equake (47.6%, 1.19)        apsi (28.5%, 1.08), vpr (28.3%, 1.28),
                                                               sixtrack (39.2%, 1.09)

Table 2 shows the classification of the ten selected benchmarks based on the values of T_vul/T and average MLP: mcf, ammp, art, applu, gzip, swm, apsi, equake, vpr and sixtrack. The pair of numbers shown in parentheses denotes the benchmark's values of T_vul/T and average MLP. These data are collected by running a single benchmark with a dedicated cache using the original Pseudo-LRU replacement policy. The benchmarks are categorized into four classes: high-vulnerability/high-MLP, high-vulnerability/low-MLP, low-vulnerability/high-MLP and low-vulnerability/low-MLP.

We pair the benchmarks presented above and run them concurrently in our simulated system with the shared cache to evaluate the fairness metrics of different schemes. Table 3 shows the classification parameters when the benchmarks are running concurrently with the shared cache under the original replacement policy. We can see that although the two parameters may vary widely (especially the portion of cache miss latency), the relationship between the parameter values is kept. That is, if a benchmark from the high/high class is running concurrently with a benchmark from the low/low class, the first benchmark usually still has a larger portion of cache miss latency and a larger MLP. The same holds for the other combinations of benchmarks from the remaining classes.

    Table 3. Benchmark pairs and parameters when running concurrently
    Pair              T_vul/T (App1)   T_vul/T (App2)
    gzip+mcf          32.0%            82.7%
    gzip+art          26.7%            62.2%
    apsi+art          36.1%            61.9%
    swm+apsi          87.1%            31.1%
    vpr+applu         31.2%            78.1%
    ammp+applu        44.0%            86.7%
    mcf+swm           83.8%            67.8%
    swm+art           85.0%            73.6%
    equake+mcf        55.0%            68.8%
    ammp+sixtrack     40.0%            41.6%
    equake+sixtrack   64.6%            48.2%
5.4 Results and Analysis

5.4.1 Model Accuracy Verification

Before evaluating the fairness metrics, we need to verify how accurate the model described in the sections above is. In this verification experiment, we ran the benchmarks in two passes. In the first pass, we used the hardware configuration in Table 1, but only applied the profiler component of the hardware mechanism to the shared L2 cache; the cache replacement policy was not modified. We ran the benchmark pairs in Table 3 in this shared-cache configuration and got the necessary statistics for estimating the benchmarks' performance (the estimated IPC, denoted IPC_est) when running with a dedicated cache. In the second pass, we ran these benchmarks again on identical hardware except that each core used a dedicated L2 cache (the dedicated cache has the same configuration as the shared cache).

In this pass, we obtained the benchmarks' actual performance (IPC) when running with a dedicated cache. Then we can compare IPC_est with the actual IPC and get the relative error:

    E = | 1 − IPC_est / IPC |    (14)

To review the effect of employing the set sampling technique in the ATD, we compared the estimation error for different sampling intervals. From Figure 5 we can see that if set sampling is not used in the ATD (sampling interval 1), the average E is 4.7% and the maximum E is 10.9% (equake running with sixtrack). As the sampling interval increases, E increases as well. When sampling ATD sets with interval 16, the average E is 6.7% and the maximum E is 13.3% (art running with gzip), which is still accurate enough. So the following experiments used interval 16 for set sampling in the ATD.

[Figure 5. Verifying accuracy of the model. The Y axis represents the relative error E defined in Equation 14; the X axis represents the set sampling interval in the ATD; interval 1 means no set sampling (every tag set in the ATD is sampled).]

5.4.2 Evaluation of fairness metrics

We use F_perf^scheme and F_miss^scheme described in Section 5.2 to measure the fairness improvement of two cache partition schemes: the scheme that enforces cache miss fairness (SECF) and the scheme that enforces performance fairness (SEPF). The baseline is uncontrolled cache sharing using the original Pseudo LRU policy, which is widely used in production processors.

Figure 6 shows the experiment results for the cache miss fairness metric (F_miss^scheme) and the performance fairness metric (F_perf^scheme). For each benchmark pair, the two benchmarks are executed concurrently for three passes: in the first pass, the shared L2 cache uses the original, uncontrolled Pseudo LRU replacement policy; in the second and third passes, the cache partition schemes of enforcing cache miss fairness (SECF) and enforcing performance fairness (SEPF) are used. The two fairness metrics F_miss^scheme and F_perf^scheme are measured for each pass of execution.

[Figure 6. Comparing fairness metrics of selected benchmark pairs with uncontrolled Pseudo LRU and the two cache partition schemes: (a) cache miss fairness metric (F_miss^scheme); (b) performance fairness metric (F_perf^scheme). Note that a shorter bar means better fairness.]

In Figure 6 (a), we can see that when using SECF to enforce cache miss fairness on the shared cache, the cache miss fairness metric (F_miss^scheme) is improved significantly compared to the uncontrolled Pseudo LRU replacement policy (note that a shorter bar means better fairness). However, when using SEPF to enforce performance fairness on the shared cache, although there is still an improvement in the cache miss fairness metric (F_miss^scheme) compared to the uncontrolled cache for most benchmarks, F_miss^scheme is not improved as much as with SECF. There are even benchmark pairs which show worse cache miss fairness than the uncontrolled Pseudo LRU cache when using SEPF (ammp+equake and equake+mcf). On average, SECF gains a cache miss fairness improvement of F_miss^scheme = 0.14, while SEPF's average improvement is smaller.

Figure 6 (b) shows the performance fairness metric (F_perf^scheme) when using the different cache partition schemes. The result is different from the result for the cache miss fairness metric. Although SECF always shows a better F_miss^scheme than SEPF, it fails to gain a better performance fairness metric (F_perf^scheme) than SEPF. What is more, three benchmark pairs suffer great performance fairness degradation when using SECF to enforce cache miss fairness: swm+art (3.87), equake+mcf (18.3) and equake+sixtrack (3.32), which means that SECF may make the problem of unfair performance of concurrent workloads even worse in some cases. In contrast, although SEPF gets a poor F_miss^scheme for ammp+equake and equake+mcf, SEPF ends up gaining better performance fairness on these benchmark pairs compared to SECF. On average, SECF yields F_perf^scheme = 1.04, while SEPF achieves a smaller (better) value.

From the comparison of the results shown in Figure 6 (a) and Figure 6 (b), we can get a deeper insight into the correlation between cache miss fairness and performance fairness. Merely enforcing cache miss fairness on concurrent workloads can improve performance fairness in most cases, but cannot guarantee ideal performance fairness and sometimes may hurt performance fairness instead of improving it, especially when the co-scheduled workloads have different features. For example, when benchmarks of type 4 (low-vulnerability/low-MLP) are running with benchmarks of type 1 (high-vulnerability/high-MLP), such as the benchmark pairs gzip+mcf, gzip+art and apsi+art, SECF gains a much poorer performance fairness than SEPF; when benchmarks of type 2 (high-vulnerability/low-MLP) are running with benchmarks of type 1 (high-vulnerability/high-MLP), such as the benchmark pairs swm+art, equake+mcf and equake+sixtrack, SECF even hurts performance fairness a lot. The reason is that when a low-MLP workload is running concurrently with a high-MLP workload, more shared cache resources should be allocated to the low-MLP workload to guarantee overall performance fairness, because the low-MLP workload has a relatively larger cache miss penalty, and enforcing cache miss fairness cannot guarantee performance fairness. And if both of the concurrently running workloads are highly sensitive to cache interference (type 1 benchmarks running with type 2 benchmarks), the degradation of performance fairness is more obvious when using SECF to enforce cache miss fairness. By taking more factors into account, SEPF achieves better performance fairness in most cases.

Figure 7 shows the throughput of the selected benchmark pairs with uncontrolled Pseudo LRU and the two cache partition schemes.

[Figure 7. Comparing throughput of selected benchmark pairs with uncontrolled Pseudo LRU and the two cache partition schemes. Note that a taller bar means higher throughput.]
We use the fair speedup metric to measure the throughput of the concurrently running benchmark pairs. Note that a taller bar in the figure means higher throughput. We can see that SEPF gains a competitive throughput for most benchmark pairs, and for gzip+mcf, ammp+applu and swm+mcf, SEPF provides higher throughput than the original Pseudo LRU and SECF. On average, SECF gains a fair speedup of 1.04 while SEPF gains a fair speedup of 1.07 over the baseline of the original Pseudo LRU replacement policy.

6. Related Work

Prior work has noticed the impact of cache sharing on concurrently running threads. [15] proposed to use hardware counters to estimate the cache miss rate as a function of cache size, which can be used to optimize the cache partition for minimum overall miss rate. [13] designed a runtime mechanism that partitions a shared cache according to the cache utility of the concurrently running applications. [15] and [13] both focused on optimization of the overall miss rate. [6] pointed out the necessity of enforcing fairness for co-scheduled workloads. [6] defined a set of fairness metrics for cache sharing and proposed both a static strategy and a dynamic mechanism to improve fairness. [4] proposed a framework to provide QoS for resources including shared caches. [14] designed architectural support for the OS to manage shared caches.

Some research indicates that a decrease in cache miss rate does not necessarily lead to a performance improvement. [12] indicated that not every cache miss has an equal penalty because of the existence of MLP. By taking the MLP-related cost of each cache miss into account, [12] modified the standard LRU replacement policy and achieved higher performance. [2] analyzed the microarchitecture impact on MLP and developed a detailed model relating MLP to overall performance. There is also research on performance models for other architectures. [3] proposed a cycle accounting architecture for SMT processors to estimate the performance each co-scheduled thread would have had if it had run alone. [5] proposed a performance model for superscalar processors.

7. Conclusion

Uncontrolled sharing usually leads to unfair performance of concurrent workloads. That is, some workloads suffer a much more significant slowdown than other workloads. This phenomenon brings problems such as priority inversion and thread starvation to the operating system's process scheduler. Instead of enforcing ideal performance fairness directly, prior work addressing the fairness issue of cache sharing mainly focuses on fairness metrics based on cache miss numbers or miss rates. However, because of the variation of cache miss penalty, fairness on cache misses cannot guarantee ideal fairness. Cache sharing management which directly addresses ideal performance fairness is needed for CMP systems.

This paper proposed the concept of a performance fairness metric. We built a detailed model to analyze the performance impact of cache sharing. Guided by this model, we designed a hardware mechanism to enforce performance fairness on the shared cache. The hardware mechanism proposed in this paper is adaptive and hardware efficient. For comparison, the concept of a cache miss fairness metric and a hardware mechanism to enforce cache miss fairness were also introduced. We implemented these two cache partition schemes in a simulator. The experiment results showed that the proposed mechanism always improves the performance fairness metric, and provides no worse throughput than the scenario without any management mechanism.

References

[1] J. Chang and G. S. Sohi. Cooperative cache partitioning for chip multiprocessors. In ICS '07: Proceedings of the 21st Annual International Conference on Supercomputing, New York, NY, USA. ACM.
[2] Y. Chou, B. Fahs, and S. G. Abraham. Microarchitecture optimizations for exploiting memory-level parallelism. In ISCA, pages 76-89.
[3] S. Eyerman and L. Eeckhout. Per-thread cycle accounting in SMT processors. In ASPLOS '09: Proceedings of the 14th International Conference on Architectural Support for Programming Languages and Operating Systems, New York, NY, USA. ACM.
[4] F. Guo, Y. Solihin, L. Zhao, and R. Iyer. A framework for providing quality of service in chip multi-processors. In MICRO.
[5] T. S. Karkhanis and J. E. Smith. A first-order superscalar processor model. In ISCA '04: Proceedings of the 31st Annual International Symposium on Computer Architecture, page 338, Washington, DC, USA. IEEE Computer Society.
[6] S. Kim, D. Chandra, and Y. Solihin. Fair cache sharing and partitioning in a chip multiprocessor architecture. In IEEE PACT.
[7] J. Lin, Q. Lu, X. Ding, Z. Zhang, and X. Zhang. Gaining insights into multi-core cache partitioning: Bridging the gap between simulation and real systems. In HPCA '08: Proceedings of the 14th International Symposium on High-Performance Computer Architecture.
[8] P. S. Magnusson, M. Christensson, J. Eskilson, D. Forsgren, G. Hallberg, J. Hogberg, F. Larsson, A. Moestedt, and B. Werner. Simics: A full system simulation platform. Computer, 35(2):50-58.
[9] M. M. K. Martin, D. J. Sorin, B. M. Beckmann, M. R. Marty, M. Xu, A. R. Alameldeen, K. E. Moore, M. D. Hill, and D. A. Wood. Multifacet's general execution-driven multiprocessor simulator (GEMS) toolset. SIGARCH Comput. Archit. News, 33, 2005.
[10] E. Perelman, G. Hamerly, M. Van Biesbrouck, T. Sherwood, and B. Calder. Using SimPoint for accurate and efficient simulation. In ACM SIGMETRICS Performance Evaluation Review.
[11] M. Qureshi. Adaptive spill-receive for robust high-performance caching in CMPs. In HPCA 2009: IEEE 15th International Symposium on High Performance Computer Architecture, pages 45-54, Feb. 2009.
[12] M. K. Qureshi, D. N. Lynch, O. Mutlu, and Y. N. Patt. A case for MLP-aware cache replacement. In ISCA-33.
[13] M. K. Qureshi and Y. N. Patt. Utility-based cache partitioning: A low-overhead, high-performance, runtime mechanism to partition shared caches. In MICRO-39.
[14] N. Rafique, W.-T. Lim, and M. Thottethodi. Architectural support for operating system-driven CMP cache management. In PACT '06: Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques, pages 2-12, New York, NY, USA. ACM.
[15] G. E. Suh, S. Devadas, and L. Rudolph. A new memory monitoring scheme for memory-aware scheduling and partitioning. In HPCA '02: Proceedings of the International Symposium on High-Performance Computer Architecture, 2002.


More information

Evaluation of an Enhanced Scheme for High-level Nested Network Mobility

Evaluation of an Enhanced Scheme for High-level Nested Network Mobility IJCSNS Internatonal Journal of Computer Scence and Network Securty, VOL.15 No.10, October 2015 1 Evaluaton of an Enhanced Scheme for Hgh-level Nested Network Moblty Mohammed Babker Al Mohammed, Asha Hassan.

More information

Verification by testing

Verification by testing Real-Tme Systems Specfcaton Implementaton System models Executon-tme analyss Verfcaton Verfcaton by testng Dad? How do they know how much weght a brdge can handle? They drve bgger and bgger trucks over

More information

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated.

Some Advanced SPC Tools 1. Cumulative Sum Control (Cusum) Chart For the data shown in Table 9-1, the x chart can be generated. Some Advanced SP Tools 1. umulatve Sum ontrol (usum) hart For the data shown n Table 9-1, the x chart can be generated. However, the shft taken place at sample #21 s not apparent. 92 For ths set samples,

More information

An Entropy-Based Approach to Integrated Information Needs Assessment

An Entropy-Based Approach to Integrated Information Needs Assessment Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology

More information

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)

For instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1) Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A

More information

Modeling Multiple Input Switching of CMOS Gates in DSM Technology Using HDMR

Modeling Multiple Input Switching of CMOS Gates in DSM Technology Using HDMR 1 Modelng Multple Input Swtchng of CMOS Gates n DSM Technology Usng HDMR Jayashree Srdharan and Tom Chen Dept. of Electrcal and Computer Engneerng Colorado State Unversty, Fort Collns, CO, 8523, USA (jaya@engr.colostate.edu,

More information

Improved Resource Allocation Algorithms for Practical Image Encoding in a Ubiquitous Computing Environment

Improved Resource Allocation Algorithms for Practical Image Encoding in a Ubiquitous Computing Environment JOURNAL OF COMPUTERS, VOL. 4, NO. 9, SEPTEMBER 2009 873 Improved Resource Allocaton Algorthms for Practcal Image Encodng n a Ubqutous Computng Envronment Manxong Dong, Long Zheng, Kaoru Ota, Song Guo School

More information

SURFACE PROFILE EVALUATION BY FRACTAL DIMENSION AND STATISTIC TOOLS USING MATLAB

SURFACE PROFILE EVALUATION BY FRACTAL DIMENSION AND STATISTIC TOOLS USING MATLAB SURFACE PROFILE EVALUATION BY FRACTAL DIMENSION AND STATISTIC TOOLS USING MATLAB V. Hotař, A. Hotař Techncal Unversty of Lberec, Department of Glass Producng Machnes and Robotcs, Department of Materal

More information

Reliability and Energy-aware Cache Reconfiguration for Embedded Systems

Reliability and Energy-aware Cache Reconfiguration for Embedded Systems Relablty and Energy-aware Cache Reconfguraton for Embedded Systems Yuanwen Huang and Prabhat Mshra Department of Computer and Informaton Scence and Engneerng Unversty of Florda, Ganesvlle FL 326-62, USA

More information

User Authentication Based On Behavioral Mouse Dynamics Biometrics

User Authentication Based On Behavioral Mouse Dynamics Biometrics User Authentcaton Based On Behavoral Mouse Dynamcs Bometrcs Chee-Hyung Yoon Danel Donghyun Km Department of Computer Scence Department of Computer Scence Stanford Unversty Stanford Unversty Stanford, CA

More information

Run-Time Operator State Spilling for Memory Intensive Long-Running Queries

Run-Time Operator State Spilling for Memory Intensive Long-Running Queries Run-Tme Operator State Spllng for Memory Intensve Long-Runnng Queres Bn Lu, Yal Zhu, and lke A. Rundenstener epartment of Computer Scence, Worcester Polytechnc Insttute Worcester, Massachusetts, USA {bnlu,

More information

The Research of Support Vector Machine in Agricultural Data Classification

The Research of Support Vector Machine in Agricultural Data Classification The Research of Support Vector Machne n Agrcultural Data Classfcaton Le Sh, Qguo Duan, Xnmng Ma, Me Weng College of Informaton and Management Scence, HeNan Agrcultural Unversty, Zhengzhou 45000 Chna Zhengzhou

More information

High-Boost Mesh Filtering for 3-D Shape Enhancement

High-Boost Mesh Filtering for 3-D Shape Enhancement Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,

More information

arxiv: v3 [cs.ds] 7 Feb 2017

arxiv: v3 [cs.ds] 7 Feb 2017 : A Two-stage Sketch for Data Streams Tong Yang 1, Lngtong Lu 2, Ybo Yan 1, Muhammad Shahzad 3, Yulong Shen 2 Xaomng L 1, Bn Cu 1, Gaogang Xe 4 1 Pekng Unversty, Chna. 2 Xdan Unversty, Chna. 3 North Carolna

More information

A fast algorithm for color image segmentation

A fast algorithm for color image segmentation Unersty of Wollongong Research Onlne Faculty of Informatcs - Papers (Arche) Faculty of Engneerng and Informaton Scences 006 A fast algorthm for color mage segmentaton L. Dong Unersty of Wollongong, lju@uow.edu.au

More information

Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search

Can We Beat the Prefix Filtering? An Adaptive Framework for Similarity Join and Search Can We Beat the Prefx Flterng? An Adaptve Framework for Smlarty Jon and Search Jannan Wang Guolang L Janhua Feng Department of Computer Scence and Technology, Tsnghua Natonal Laboratory for Informaton

More information

A Predictable Execution Model for COTS-based Embedded Systems

A Predictable Execution Model for COTS-based Embedded Systems 2011 17th IEEE Real-Tme and Embedded Technology and Applcatons Symposum A Predctable Executon Model for COTS-based Embedded Systems Rodolfo Pellzzon, Emlano Bett, Stanley Bak, Gang Yao, John Crswell, Marco

More information

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation

Quality Improvement Algorithm for Tetrahedral Mesh Based on Optimal Delaunay Triangulation Intellgent Informaton Management, 013, 5, 191-195 Publshed Onlne November 013 (http://www.scrp.org/journal/m) http://dx.do.org/10.36/m.013.5601 Qualty Improvement Algorthm for Tetrahedral Mesh Based on

More information

FAHP and Modified GRA Based Network Selection in Heterogeneous Wireless Networks

FAHP and Modified GRA Based Network Selection in Heterogeneous Wireless Networks 2017 2nd Internatonal Semnar on Appled Physcs, Optoelectroncs and Photoncs (APOP 2017) ISBN: 978-1-60595-522-3 FAHP and Modfed GRA Based Network Selecton n Heterogeneous Wreless Networks Xaohan DU, Zhqng

More information

Two-Stage Data Distribution for Distributed Surveillance Video Processing with Hybrid Storage Architecture

Two-Stage Data Distribution for Distributed Surveillance Video Processing with Hybrid Storage Architecture Two-Stage Data Dstrbuton for Dstrbuted Survellance Vdeo Processng wth Hybrd Storage Archtecture Yangyang Gao, Hatao Zhang, Bngchang Tang, Yanpe Zhu, Huadong Ma Bejng Key Lab of Intellgent Telecomm. Software

More information

A Generic and Compositional Framework for Multicore Response Time Analysis

A Generic and Compositional Framework for Multicore Response Time Analysis A Generc and Compostonal Framework for Multcore Response Tme Analyss Sebastan Altmeyer Unversty of Luxembourg Unversty of Amsterdam Clare Maza Grenoble INP Vermag Robert I. Davs Unversty of York INRIA,

More information

Avoiding congestion through dynamic load control

Avoiding congestion through dynamic load control Avodng congeston through dynamc load control Vasl Hnatyshn, Adarshpal S. Seth Department of Computer and Informaton Scences, Unversty of Delaware, Newark, DE 976 ABSTRACT The current best effort approach

More information

Study of Data Stream Clustering Based on Bio-inspired Model

Study of Data Stream Clustering Based on Bio-inspired Model , pp.412-418 http://dx.do.org/10.14257/astl.2014.53.86 Study of Data Stream lusterng Based on Bo-nspred Model Yngme L, Mn L, Jngbo Shao, Gaoyang Wang ollege of omputer Scence and Informaton Engneerng,

More information

Real-time Scheduling

Real-time Scheduling Real-tme Schedulng COE718: Embedded System Desgn http://www.ee.ryerson.ca/~courses/coe718/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrcal and Computer Engneerng Ryerson Unversty Overvew RTX

More information

Load-Balanced Anycast Routing

Load-Balanced Anycast Routing Load-Balanced Anycast Routng Chng-Yu Ln, Jung-Hua Lo, and Sy-Yen Kuo Department of Electrcal Engneerng atonal Tawan Unversty, Tape, Tawan sykuo@cc.ee.ntu.edu.tw Abstract For fault-tolerance and load-balance

More information

Corner-Based Image Alignment using Pyramid Structure with Gradient Vector Similarity

Corner-Based Image Alignment using Pyramid Structure with Gradient Vector Similarity Journal of Sgnal and Informaton Processng, 013, 4, 114-119 do:10.436/jsp.013.43b00 Publshed Onlne August 013 (http://www.scrp.org/journal/jsp) Corner-Based Image Algnment usng Pyramd Structure wth Gradent

More information

COMPARISON OF TWO MODELS FOR HUMAN EVACUATING SIMULATION IN LARGE BUILDING SPACES. University, Beijing , China

COMPARISON OF TWO MODELS FOR HUMAN EVACUATING SIMULATION IN LARGE BUILDING SPACES. University, Beijing , China COMPARISON OF TWO MODELS FOR HUMAN EVACUATING SIMULATION IN LARGE BUILDING SPACES Bn Zhao 1, 2, He Xao 1, Yue Wang 1, Yuebao Wang 1 1 Department of Buldng Scence and Technology, Tsnghua Unversty, Bejng

More information

Space-Optimal, Wait-Free Real-Time Synchronization

Space-Optimal, Wait-Free Real-Time Synchronization 1 Space-Optmal, Wat-Free Real-Tme Synchronzaton Hyeonjoong Cho, Bnoy Ravndran ECE Dept., Vrgna Tech Blacksburg, VA 24061, USA {hjcho,bnoy}@vt.edu E. Douglas Jensen The MITRE Corporaton Bedford, MA 01730,

More information

An Efficient Garbage Collection for Flash Memory-Based Virtual Memory Systems

An Efficient Garbage Collection for Flash Memory-Based Virtual Memory Systems S. J and D. Shn: An Effcent Garbage Collecton for Flash Memory-Based Vrtual Memory Systems 2355 An Effcent Garbage Collecton for Flash Memory-Based Vrtual Memory Systems Seunggu J and Dongkun Shn, Member,

More information

A Deflected Grid-based Algorithm for Clustering Analysis

A Deflected Grid-based Algorithm for Clustering Analysis A Deflected Grd-based Algorthm for Clusterng Analyss NANCY P. LIN, CHUNG-I CHANG, HAO-EN CHUEH, HUNG-JEN CHEN, WEI-HUA HAO Department of Computer Scence and Informaton Engneerng Tamkang Unversty 5 Yng-chuan

More information

Multiple Sub-Row Buffers in DRAM: Unlocking Performance and Energy Improvement Opportunities

Multiple Sub-Row Buffers in DRAM: Unlocking Performance and Energy Improvement Opportunities Multple Sub-Row Buffers n DRAM: Unlockng Performance and Energy Improvement Opportuntes ABSTRACT Nagendra Gulur Texas Instruments (Inda) nagendra@t.com Mahesh Mehendale Texas Instruments (Inda) m-mehendale@t.com

More information

y and the total sum of

y and the total sum of Lnear regresson Testng for non-lnearty In analytcal chemstry, lnear regresson s commonly used n the constructon of calbraton functons requred for analytcal technques such as gas chromatography, atomc absorpton

More information

A Parallelization Design of JavaScript Execution Engine

A Parallelization Design of JavaScript Execution Engine , pp.171-184 http://dx.do.org/10.14257/mue.2014.9.7.15 A Parallelzaton Desgn of JavaScrpt Executon Engne Duan Huca 1,2, N Hong 2, Deng Feng 2 and Hu Lnln 2 1 Natonal Network New eda Engneerng Research

More information

Pricing Network Resources for Adaptive Applications in a Differentiated Services Network

Pricing Network Resources for Adaptive Applications in a Differentiated Services Network IEEE INFOCOM Prcng Network Resources for Adaptve Applcatons n a Dfferentated Servces Network Xn Wang and Hennng Schulzrnne Columba Unversty Emal: {xnwang, schulzrnne}@cs.columba.edu Abstract The Dfferentated

More information

Review of approximation techniques

Review of approximation techniques CHAPTER 2 Revew of appromaton technques 2. Introducton Optmzaton problems n engneerng desgn are characterzed by the followng assocated features: the objectve functon and constrants are mplct functons evaluated

More information

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach

Skew Angle Estimation and Correction of Hand Written, Textual and Large areas of Non-Textual Document Images: A Novel Approach Angle Estmaton and Correcton of Hand Wrtten, Textual and Large areas of Non-Textual Document Images: A Novel Approach D.R.Ramesh Babu Pyush M Kumat Mahesh D Dhannawat PES Insttute of Technology Research

More information

A Statistical Model Selection Strategy Applied to Neural Networks

A Statistical Model Selection Strategy Applied to Neural Networks A Statstcal Model Selecton Strategy Appled to Neural Networks Joaquín Pzarro Elsa Guerrero Pedro L. Galndo joaqun.pzarro@uca.es elsa.guerrero@uca.es pedro.galndo@uca.es Dpto Lenguajes y Sstemas Informátcos

More information

Shared Running Buffer Based Proxy Caching of Streaming Sessions

Shared Running Buffer Based Proxy Caching of Streaming Sessions Shared Runnng Buffer Based Proxy Cachng of Streamng Sessons Songqng Chen, Bo Shen, Yong Yan, Sujoy Basu Moble and Meda Systems Laboratory HP Laboratores Palo Alto HPL-23-47 March th, 23* E-mal: sqchen@cs.wm.edu,

More information

Hybrid Non-Blind Color Image Watermarking

Hybrid Non-Blind Color Image Watermarking Hybrd Non-Blnd Color Image Watermarkng Ms C.N.Sujatha 1, Dr. P. Satyanarayana 2 1 Assocate Professor, Dept. of ECE, SNIST, Yamnampet, Ghatkesar Hyderabad-501301, Telangana 2 Professor, Dept. of ECE, AITS,

More information

Configuration Management in Multi-Context Reconfigurable Systems for Simultaneous Performance and Power Optimizations*

Configuration Management in Multi-Context Reconfigurable Systems for Simultaneous Performance and Power Optimizations* Confguraton Management n Mult-Context Reconfgurable Systems for Smultaneous Performance and Power Optmzatons* Rafael Maestre, Mlagros Fernandez Departamento de Arqutectura de Computadores y Automátca Unversdad

More information

Available online at Available online at Advanced in Control Engineering and Information Science

Available online at   Available online at   Advanced in Control Engineering and Information Science Avalable onlne at wwwscencedrectcom Avalable onlne at wwwscencedrectcom Proceda Proceda Engneerng Engneerng 00 (2011) 15000 000 (2011) 1642 1646 Proceda Engneerng wwwelsevercom/locate/proceda Advanced

More information

Support Vector Machines

Support Vector Machines /9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.

More information