Utility-Based Hybrid Memory Management


Yang Li, Saugata Ghose, Jongmoo Choi, Jin Sun, Hui Wang, Onur Mutlu
Carnegie Mellon University, Dankook University, Beihang University, ETH Zürich

While the memory footprints of cloud and HPC applications continue to increase, fundamental issues with DRAM scaling are likely to prevent traditional main memory systems, composed of monolithic DRAM, from greatly growing in capacity. Hybrid memory systems can mitigate the scaling limitations of monolithic DRAM by pairing together multiple memory technologies (e.g., different types of DRAM, or DRAM and non-volatile memory) at the same level of the memory hierarchy. The goal of a hybrid main memory is to combine the different advantages of the multiple memory types in a cost-effective manner while avoiding the disadvantages of each technology. Memory pages are placed in and migrated between the different memories within a hybrid memory system, based on the properties of each page. It is important to make intelligent page management (i.e., placement and migration) decisions, as they can significantly affect system performance. In this paper, we propose utility-based hybrid memory management (UH-MEM), a new page management mechanism for various hybrid memories, that systematically estimates the utility (i.e., the system performance benefit) of migrating a page between different memory types, and uses this information to guide data placement. UH-MEM operates in two steps. First, it estimates how much a single application would benefit from migrating one of its pages to a different type of memory, by comprehensively considering access frequency, row buffer locality, and memory-level parallelism. Second, it translates the estimated benefit of a single application to an estimate of the overall system performance benefit from such a migration. We evaluate the effectiveness of UH-MEM with various types of hybrid memories, and show that it significantly improves system performance on each of these hybrid memories.
For a memory system with DRAM and non-volatile memory, UH-MEM improves performance by 14% on average (and up to 26%) compared to the best of three evaluated state-of-the-art mechanisms across a large number of data-intensive workloads.

1. Introduction

Modern large-scale computing clusters continue to employ dynamic random access memory (DRAM) as the main memory system within each server. However, as the amount of memory consumed by the applications running on these clusters (e.g., high-performance computing workloads, large-scale data analytics) grows, traditional DRAM-based memory systems are unlikely to be able to keep up with this growth. DRAM scaling is expected to become increasingly difficult [90, 91] due to increasing cell leakage current [42, 65, 66, 97], reduced cell reliability [46, 76, 91, 113], and increasing manufacturing complexity [37, 41, 46, 74, 90, 91, 96, 107]. As a result, other memory solutions have emerged to offer low-latency, low-power, or high-capacity substrates without heavily relying on DRAM scaling. New DRAM products such as 3D-stacked DRAM [3, 45, 60, 61, 99], reduced-latency DRAM (RLDRAM) [80], and low-power DRAM (LPDRAM) [82] make use of novel DRAM circuit designs, architectures, and interfaces to better cater to applications such as scientific computing, data mining, network traffic, and mobile computing. In addition, emerging non-volatile memory (NVM) technologies (e.g., PCM [53, 54, 55, 104, 124], STT-RAM [52], ReRAM [68], and 3D XPoint [83]) have shown promise for future main memory system designs to meet the increasing memory capacity demands of data-intensive workloads. With projected scaling trends, NVM cells can be manufactured more easily at smaller feature sizes than DRAM cells, achieving high density and capacity [14, 15, 52, 53, 54, 55, 68, 104, 107, 120, 124, 131]. However, these new memory technologies are unlikely to fully replace commodity DRAM in main memory systems. For example, 3D-stacked DRAM is limited in capacity [12]. RLDRAM has a higher cost-per-bit than commodity DRAM [8, 49, 58, 59].
Most NVMs incur high access latency and high dynamic energy consumption, and some NVM technologies have limited write endurance. To address these weaknesses, hybrid memory systems (or heterogeneous memory systems), comprised of both commodity DRAM and one of these alternative memory technologies, have been proposed. A hybrid memory system aims to combine the benefits of both of its component memory types in a cost-effective manner [104, 126]. For example, commodity DRAM is faster than NVM, but has a higher cost per bit. A hybrid memory with both commodity DRAM and NVM utilizes a small amount of DRAM and a large amount of NVM, to provide the illusion that the system has large memory capacity (of NVM), and that all data can be accessed at low latency (of DRAM). Hybrid memory systems can potentially meet both the performance and memory capacity (as well as memory energy efficiency) needs of large-scale computing clusters [4, 5, 31, 33, 64, 73, 75, 98, 100, 104, 126]. In order to successfully deliver high memory capacity at low latency, hybrid memory systems must make intelligent data placement decisions, choosing whether each page should be placed in the high-capacity memory or in the fast memory. Previous data management proposals for hybrid memories consider only a limited number of characteristics, using these few data points to construct a placement heuristic that is specific to the memory types being used in the system. For example, the majority of prior work on hybrid DRAM-NVM main memory systems either treats DRAM as a conventional cache [104] or places data with high access frequency, high write intensity, and/or low row buffer locality in DRAM [20, 39, 106, 126, 129], while placing the remaining data in NVM, as the access latency of NVM is generally higher than that of DRAM [53, 104]. A mechanism for combining commodity DRAM with 3D-stacked DRAM organizes the faster 3D-stacked DRAM as a page-granularity cache of the commodity DRAM, but identifies and places only the cache blocks that will be accessed in 3D-stacked DRAM [38]. Work on combining RLDRAM with commodity DRAM identifies and places only critical data words into the RLDRAM to reduce access latency [11]. These heuristic-based approaches do not directly capture the overall system performance benefits of data placement decisions (as we will show in Section 3). Therefore, they can only indirectly optimize system performance, which sometimes leads to sub-optimal data placement decisions. For example, let us consider a memory manager that migrates memory pages that are accessed frequently [39] and that inherently have a high access latency (i.e., they have low row buffer locality) [126] from the slower NVM to the faster commodity DRAM. A page migration based on only these two heuristics may not improve system performance if, for instance, accesses to the page being migrated are completely overlapped with other requests from the same application that continue to access the slower NVM. In such a case, the latency reduction for accesses to the migrated page would not reduce the application's execution time, as the application still needs to wait for the accesses to the slower NVM to complete. The example memory manager is unable to capture this overlap with its simple heuristics, and thus incorrectly decides to migrate the page in this example. Our goal in this work is to devise a generalized mechanism that directly estimates the overall system performance benefit of migrating a page between any two types of memory, and places only the performance-critical data in the fastest memory within the hybrid main memory system.
To this end, we propose utility-based hybrid memory management (UH-MEM), a new hardware mechanism that estimates the marginal performance utility of each page (i.e., the system performance benefit of migrating the page to a faster memory type), and migrates only those pages with the greatest utility. UH-MEM employs two steps. First, it determines how much migrating a page belonging to an individual application would improve that application's performance. To do this, UH-MEM uses a new performance model that considers several factors, including how frequently each page is accessed, whether row buffer locality impacts the performance benefits of migration, and how much the page access latency is hidden by overlapping requests (i.e., the level of memory-level parallelism, or MLP [13, 57, 87, 92, 93, 94]). Second, UH-MEM estimates how much the improvement of a single application's performance benefits the overall system performance, as different workloads have different amounts of impact on overall system performance. UH-MEM migrates those pages with the greatest estimated system-level performance benefit from slow memory into fast memory.

Key Results. We extensively evaluate UH-MEM using a wide range of hybrid memory configurations, and show that it is effective at improving system performance over state-of-the-art hybrid memory managers. We quantitatively show that for a memory system with both conventional DRAM and NVM, UH-MEM improves system performance by 14% on average (and up to 26%) compared to the best of three state-of-the-art mechanisms that we evaluate (a conventional cache insertion mechanism [104], an access frequency based mechanism [39, 106], and a row buffer locality based mechanism [126]), for a large number of data-intensive workloads. We also show that the hardware cost of UH-MEM is very modest (~40KB in our baseline system).

In this paper, we make three main contributions:

- We propose the first general utility metric to estimate the potential system performance benefit of migrating a page between the different memories within a hybrid main memory system. This utility metric represents the system performance benefit as a function of (1) an application's stall time reduction if the accessed page is migrated to a faster type of memory, and (2) how an improvement to a single application's stall time impacts overall system performance.
- We propose a new performance model that can be implemented in hardware, which comprehensively considers the access frequency, row buffer locality, and MLP of a page to systematically estimate an application's stall time reduction from migrating the page. This is the first work to consider MLP in addition to access frequency, row buffer locality, and write intensity, and to model the interactions between them, for page placement decisions.
- Based on our new metric and new performance model, we propose the first utility-based hybrid memory management mechanism, UH-MEM, which selectively places pages that are most beneficial to overall system performance in fast memory within a hybrid memory system. Our mechanism is general, and works with a wide variety of memory types that can be used in a hybrid memory system. We quantitatively demonstrate that UH-MEM outperforms three state-of-the-art hybrid memory management techniques.

2. Background

In this section, we provide background on the organization and management of hybrid memory systems. Figure 1 shows an example hybrid memory system. This hybrid memory system has two different types of memory, which we call Memory A and Memory B. One of these memories (we arbitrarily choose Memory A) is faster than the other, while the other memory (Memory B) has a greater capacity due to its higher density. The goal of a hybrid memory system is to provide the large main memory capacity of Memory B, while providing the fast access latencies of Memory A for memory accesses that affect execution time.

[Figure 1: A typical hybrid memory system. Cores/caches connect through memory controllers to Memory A (fast, small) over Channel A and to Memory B (large, slow) over Channel B; each memory contains banks with row buffers.]

When a memory request is issued by a processor (e.g., the CPU), the memory controllers determine whether the request should be sent to Memory A or Memory B. Each memory has its own memory channel (i.e., a bus that connects the memory to its respective memory controller), and is internally organized similarly to today's DRAM.[1] Each memory consists of multiple banks, where each bank is a two-dimensional array of memory cells organized into rows and columns. Each bank can operate in parallel, but all banks within a channel share the address, data, and command buses. Within each bank, there is an internal buffer called the row buffer. When data is accessed from a bank, the entire row containing the data is brought into the row buffer. Hence, a subsequent access to data from the same row can be served from the row buffer and need not access the array. Such an access is called a row buffer hit. If a subsequent access is to data in a different row, the contents of the row buffer need to be written back to the row, and the new row's contents need to be brought into the row buffer. Such an access is called a row buffer conflict (or row buffer miss). A row buffer miss incurs a much higher latency than a row buffer hit. Previous works on hybrid memory systems observe that the latency of a row buffer hit is similar across memory types, while the latency of a row buffer conflict/miss is generally much higher in denser memories [53, 54, 55, 78, 79, 126]. The fraction of row buffer hits out of all memory accesses to a row is called row buffer locality. We can expect that migrating a page with low row buffer locality to the fast memory benefits performance, as a low-locality page experiences more row buffer misses, and such misses are serviced at a lower latency in the fast memory.
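The row buffer behavior described above can be sketched in a few lines. The following toy model (our illustration, not hardware from the paper) classifies each access as a row buffer hit or miss by tracking the currently open row of a bank, and computes row buffer locality as the hit fraction; the `Bank` class and access stream are illustrative assumptions.

```python
# Toy model of a memory bank's row buffer: classify accesses as hits/misses
# and compute row buffer locality (RBL). Illustrative sketch only.
class Bank:
    def __init__(self):
        self.open_row = None  # no row in the row buffer initially

    def access(self, row):
        """Return 'hit' if `row` is already open, else open it and return 'miss'."""
        if self.open_row == row:
            return "hit"
        self.open_row = row   # row buffer conflict/miss: open the new row
        return "miss"

def row_buffer_locality(rows_accessed, bank):
    """RBL = fraction of accesses that hit in the row buffer."""
    hits = sum(1 for row in rows_accessed if bank.access(row) == "hit")
    return hits / len(rows_accessed)

# Row 7 misses once then hits twice; row 9 misses: RBL = 2/4.
print(row_buffer_locality([7, 7, 7, 9], Bank()))  # 0.5
```

A page whose accesses alternate between rows of the same bank would score an RBL of 0, making it (by the reasoning above) a good candidate for migration to fast memory.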
Conversely, we can expect that migrating a page with high row buffer locality does not benefit performance much, as most of the accesses to such a high-locality page hit in the row buffer, and a row buffer hit has a similar latency in both the fast memory and the slow memory [126]. An important issue for a hybrid memory system is how to manage data stored in different memory devices. In our study, we adopt the configuration proposed by Qureshi et al. [104], and organize the fast, small memory (Memory A) as a cache for the pages in the large, slow memory (Memory B). We assume that all pages are initially in Memory B. Instead of unconditionally migrating a page when the page is accessed [69, 77, 102, 104], we selectively migrate pages into Memory A based on some metric, which is the utility of the page in our proposal. This migration may trigger the eviction of a victim page cached in Memory A, which is handled by the cache replacement policy of Memory A. We discuss our migration mechanism in Section 4.1. The migration process between memory devices is fully managed by hardware, and is transparent to the OS.

[1] We refer the reader to prior works for the detailed internal operation, organization, and control of DRAM [9, 10, 34, 46, 49, 59, 66, 88, 93, 111].

3. Motivation

In systems that can issue multiple memory requests in parallel (e.g., out-of-order execution processors, multicore processors, runahead processors), the number of cycles saved for a single memory request does not directly translate into a reduction in the application's execution time. In order to estimate the true utility of a page (i.e., the impact that migrating that page has on system performance), we need to estimate (1) by how much the latency reduction from migration would reduce the individual application's execution time (i.e., the application's stall time reduction), and (2) by how much the application's stall time reduction translates to an improvement in overall system performance (i.e., the sensitivity of overall system performance to each application's stall time).
In this section, we first demonstrate that we need to comprehensively consider three major factors, i.e., access frequency, row buffer locality, and memory-level parallelism (MLP), to estimate the stall time reduction a page provides when migrated. These factors were not fully captured in prior works [20, 39, 106, 126, 129], none of which try to estimate the effect of migration on application or system performance. Then, we show that overall system performance exhibits different sensitivity to different applications' stall time reductions, and that we want to migrate pages from applications with high sensitivity to maximize overall system performance.

3.1. Comprehensive Stall Time Estimation of an Application

To the first order, an application's stall time reduction depends on two parts: (1) how much the latency for accessing the page can be reduced, and (2) how this latency overlaps with the latencies of other memory requests from the application. For the first part, since only the row buffer miss accesses can achieve shorter latency after the migration, we need to comprehensively consider the access frequency and row buffer locality of the page (i.e., we can count the number of row buffer misses to the page) to estimate the latency reduction for the memory requests to the page. The second part depends on the parallelism of memory requests from an application (MLP). MLP is the number of concurrent outstanding requests (i.e., the in-flight memory requests that are yet to be completed) from the same application [13, 30, 87, 92, 93, 94]. In our mechanism, we consider the MLP for each page, and check how many concurrent requests from the same application typically exist when the page is accessed. If there are many concurrent requests, the access latency to the page is likely to overlap with the access latency to other pages, and therefore migrating the page to fast memory, while it may reduce its access latency, will likely result in only a limited or small reduction in the application's stall time. We illustrate this MLP effect using the conceptual example in Figure 2. Pages 0, 1, and 2 all have the same number of row buffer miss requests. Requests to Page 0 are not overlapped with other requests from the same application, while requests to Pages 1 and 2 are overlapped. We would like to see by how much the application's stall time would be reduced if we migrate each of these pages from slow memory to fast memory.

[Figure 2: Conceptual example showing that the MLP of a page influences how much effect its migration to fast memory has on the application stall time. (a) Alone request: migrating Page 0 reduces the application stall time by ΔT. (b) Overlapped requests: migrating Pages 1 and 2 together also reduces the application stall time by only ΔT.]

Suppose we migrate Page 0 to fast memory (Figure 2a). As there is no other request that overlaps with the request to Page 0, the request to Page 0 is likely to be stalling at the head of the processor reorder buffer (ROB), which often stalls the entire application [29, 51, 87, 92, 94, 95, 103]. The requests to Page 0 will complete faster upon migration, thereby decreasing the application's stall time and thus being more likely to improve application performance [29, 51, 87, 92, 94, 95, 103]. On the other hand, if we migrate both Pages 1 and 2 to fast memory (Figure 2b), requests to both pages also complete faster, but the application's overall stall time will be reduced by roughly the same amount as that enabled by migrating only Page 0, since the access latencies to Pages 1 and 2 are overlapped. In other words, despite incurring double the number of migrations and consuming double the amount of limited fast memory capacity by migrating two overlapping pages (Pages 1 and 2), we achieve only the same performance benefit enabled by migrating only a single page that is serviced alone (Page 0).
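The overlap effect in Figure 2 can be captured with a toy model (our simplifying assumption, not the paper's hardware): treat the stall-time reduction from migrating a page as its latency reduction scaled by 1/MLP, the fraction of its latency not hidden by concurrent requests.

```python
# Toy model of the MLP overlap effect from Figure 2 (illustrative only):
# a page's stall-time contribution is its latency reduction divided by MLP.
def stall_time_saved(latency_reduction, mlp):
    return latency_reduction / mlp

dT = 100.0  # latency saved by migrating one page (arbitrary units)

# Page 0 is serviced alone (MLP = 1): the full dT is saved.
print(stall_time_saved(dT, mlp=1))                         # 100.0
# Pages 1 and 2 overlap each other (MLP = 2): two migrations, same total.
print(stall_time_saved(dT, 2) + stall_time_saved(dT, 2))   # 100.0
```

Under this model, migrating both overlapped pages consumes twice the fast-memory capacity for the same total benefit as migrating the single alone-serviced page, which is exactly the inefficiency the text describes.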
Unfortunately, without MLP, we are unable to build a comprehensive model that distinguishes between these two scenarios, and mechanisms that consider only row buffer locality and access frequency may migrate pages like Pages 1 and 2 that contribute less to reducing the application's stall time.[2] Figure 3 shows the distribution of MLP across all memory pages for three representative benchmarks: soplex, xalancbmk, and YCSB-B [16, 35].[3] We can see that different pages within an application have very different MLP. Other benchmarks in our evaluation exhibit similar MLP diversity across their pages. Hence, we can take advantage of this diversity to optimize system performance.

[2] In fact, if a mechanism migrates only one of the overlapping pages (either Page 1 or Page 2), it is unlikely that it will reduce stall time at all, as the non-migrated page would still stall the CPU. A similar observation is made by Qureshi et al. in the context of caching [103].

[Figure 3: MLP distribution for all pages in three workloads: (a) soplex, (b) xalancbmk, (c) YCSB-B. Each panel plots the frequency (%) of pages at each MLP value.]

In order to quantify the impact of different factors on an application's stall time, we measure the stall time contribution of each page (i.e., the time that the outstanding memory requests to the page cause the processor to stall) for every benchmark in our evaluation. Table 1 shows the correlation coefficients between the average stall time per page and three different page-level access characteristic metrics (i.e., access frequency, row buffer locality, and MLP), along with combinations of the three.[4] This shows that independently, access frequency, row buffer locality, and MLP all correlate somewhat with a page's stall time contribution. However, this correlation becomes very strong when we comprehensively consider all three factors together (correlation coefficient = 0.92). We see that the two factors considered together in prior work (access frequency and row buffer locality) [126] do not correlate nearly as strongly (correlation coefficient = 0.76).
Therefore, we conclude that access frequency, row buffer locality, and MLP are all indispensable factors to comprehensively model the performance impact of data placement.

[Table 1: Absolute Spearman correlation coefficients between the average stall time per page and different factors (AF: access frequency; RBL: row buffer locality; MLP: memory-level parallelism), for AF, RBL, and MLP individually and for the combinations AF+RBL, AF+MLP, and AF+RBL+MLP. The correlation coefficients are between 0 and 1, where 0 = no correlation, and 1 = perfect correlation.]

[3] We run each workload separately on a system that is similar to the configuration shown in Section 5, though we use a single-core processor for the experiments shown here. When a page in the workload is accessed by a memory request, we measure how many outstanding memory requests with the same type (i.e., either read or write) exist in the workload, and use that number as the current MLP of the page. We then calculate the average MLP of each page, and report the distribution of average MLP across all of the pages in these figures.

[4] For each benchmark, we divide all of its pages into several bins, sorted by the values of the factors under consideration. We then calculate the average stall time per page for each bin. We analyze the correlation between the average stall time and the factors, and obtain the correlation coefficient. We report the average correlation coefficient over all of our benchmarks.
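The binning-and-correlation methodology described above can be reproduced in miniature: Spearman correlation is simply the Pearson correlation of ranks. The sketch below is stdlib-free, and the input values are hypothetical per-bin numbers, not the paper's measurements.

```python
# Miniature version of the correlation analysis behind Table 1:
# Spearman correlation = Pearson correlation of the rank-transformed data.
def ranks(xs):
    """Rank values 1..n, assigning average ranks to ties."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    r = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[j]]:
            j += 1
        avg = (i + j) / 2 + 1                # average rank of a tied run
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def spearman(xs, ys):
    return pearson(ranks(xs), ranks(ys))

# A perfectly monotonic relation yields a coefficient of 1.0:
print(spearman([1, 2, 3, 4], [10, 20, 40, 80]))  # 1.0
```

In the paper's setting, `xs` would be a combined-factor score per bin and `ys` the measured average stall time per page in that bin.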

3.2. Estimating Effect on Overall System Performance

Prior proposals for hybrid memory page management, which use only heuristics that are, as we have shown in Section 3.1, only somewhat correlated to application performance [11, 20, 38, 39, 104, 106, 126, 127, 128, 129], fail to capture how the stall time of a single application affects overall system performance. We find that this impact is not uniform across the applications within a multiprogrammed workload. There are several different metrics that can be used to express system performance, as has been discussed in a number of prior works [6, 26, 71, 112] (e.g., weighted speedup, harmonic speedup). These metrics express overall system performance by weighing the performance of each application within the workload differently, based on some application characteristics. For example, weighted speedup normalizes the performance of each application to its performance when running alone, in order to capture the effects of system interference between applications [26, 112]. For two applications with an equal amount of stall time reduction (in terms of absolute cycle count), the reduction for the application with a greater weight will result in a greater system performance improvement. As prior page management mechanisms are oblivious to the unequal impact of application performance benefits on overall system performance, they can migrate pages that are less important for overall system performance into the fast memory. We, therefore, incorporate the relation between application performance and overall system performance directly into our mechanism, using application weighting to prioritize pages from applications that impact the overall system performance the most. In this work, we use weighted speedup [112], which has been shown to correspond to system throughput for multiprogrammed workloads [26]. However, system designers with other target objectives can use different system performance metrics, by simply modifying the system performance estimation hardware within our proposed mechanism.

4. UH-MEM: Utility-Based Hybrid Memory Management

In this section, we introduce utility-based hybrid memory management (UH-MEM). UH-MEM is a hardware mechanism that resides within the memory controller. It performs interval-based calculations to determine which pages should be migrated from slow memory to fast memory, where fast memory is treated as a set-associative (16-way) page cache with an LRU cache replacement policy, similar to prior work [77, 104, 126]. During each interval (1 million cycles in our experiments, determined empirically), pages are selected for migration by UH-MEM, and a migration mechanism caches the data in the fast memory by copying the data first to the migration buffer in the memory controller, and then to the fast memory. Once a page is migrated to fast memory, it is inserted into a tag store within the memory controller. Whenever a request misses in the last-level on-chip cache, it looks up the tag store and the migration buffer, to see if the requested data resides in fast memory or in the migration buffer. The request is then dispatched to the appropriate location based on this lookup. As with on-chip caches, UH-MEM's operations are transparent to the OS.

4.1. Mechanism Overview

UH-MEM comprehensively estimates how the migration of each page would improve overall system performance, which we define as the utility of each page (see Section 3). The page utility calculation, as performed in hardware, is described in detail in Section 4.2. During each interval, when a page is accessed in slow memory, UH-MEM migrates the page to fast memory if its utility is greater than the migration threshold. It is not beneficial to move every accessed page into fast memory, because (1) migration operations take time to complete, and (2) doing so would cause the slow memory bandwidth to go unused. We include a mechanism to dynamically set the migration threshold at the end of each interval, which we discuss in Section 4.3. When a page is selected for migration, we first check the tag store of the fast memory to see if we need to evict another page in the destination fast memory cache set.
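As a rough illustration of the fast-memory organization described above, here is a minimal sketch of a set-associative page cache with LRU replacement; the set-indexing scheme (page number modulo the number of sets) and the software data structures are our own simplifying assumptions, standing in for the hardware tag store.

```python
# Minimal sketch of fast memory as a 16-way set-associative page cache
# with LRU replacement (illustrative model of the tag store, not RTL).
from collections import OrderedDict

class FastMemoryCache:
    def __init__(self, num_sets, ways=16):
        self.num_sets = num_sets
        self.ways = ways
        # One LRU-ordered tag store per set (oldest entry first).
        self.sets = [OrderedDict() for _ in range(num_sets)]

    def lookup(self, page):
        """True if the page resides in fast memory; refreshes its LRU position."""
        s = self.sets[page % self.num_sets]
        if page in s:
            s.move_to_end(page)       # mark as most recently used
            return True
        return False

    def insert(self, page):
        """Migrate a page in, returning the evicted LRU victim (or None)."""
        s = self.sets[page % self.num_sets]
        victim = None
        if len(s) >= self.ways:
            victim, _ = s.popitem(last=False)   # evict least recently used
        s[page] = None
        return victim
```

Note that `lookup` does not insert on a miss: consistent with the selective policy above, a page enters fast memory only when UH-MEM explicitly calls `insert` for a page whose utility exceeds the migration threshold.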
We implement a migration buffer within the memory controller to temporarily hold the migrating page(s). Each cache block in the buffer includes two migration status bits to determine where the cache block currently resides (i.e., in either of the memories, or in the buffer). The status bits allow UH-MEM to direct incoming memory requests for a migrating page to the correct place. After completing the data movement, the corresponding metadata information in the tag store is updated.

4.2. Computing Page Utility

The utility of a page depends on (1) the stall time reduction of an application due to migration of the page to the fast memory, and (2) the system performance sensitivity to the application.[5] Suppose that one page of Application i is migrated to fast memory, such that the application stall time is reduced by ΔStallTime_i. The utility of that page (U) can be expressed as:

U = ΔStallTime_i × Sensitivity_i    (1)

4.2.1. Estimating Application Stall Time Reduction. The stall time reduction due to a page migration is dependent on two factors: (1) the access latency reduction for that page, and (2) the degree to which the page's access latency is masked (i.e., overlapped) by the access latency of other concurrent requests for the same application. The degree to which a page's total access latency is reduced can be determined by using a combination of the page's access frequency and row buffer locality. If a page is migrated from slow memory to fast memory, the latency of row buffer misses decreases, while row buffer hits still achieve a similar latency. Therefore, the expected decrease in access latency is proportional to the total number of row buffer misses for that page, which is a function of access frequency and row buffer locality. We can estimate this decrease as:

ΔReadLatency = #ReadMiss × (t_slow,read − t_fast,read)    (2)
ΔWriteLatency = #WriteMiss × (t_slow,write − t_fast,write)

where #ReadMiss and #WriteMiss are the number of row buffer read and write misses, respectively, and t_fast,read, t_fast,write, t_slow,read, and t_slow,write are the device-specific read/write latencies incurred on a row buffer miss for fast memory and slow memory, respectively. In order to quantify the degree of access latency masking, we sample the total number of outstanding memory requests for that same application to model the overlap effect. Specifically, we define the MLP ratio of an application to be the reciprocal of the outstanding memory request count.[6] Intuitively, if there are fewer outstanding requests, then there is less memory-level parallelism available to overlap the page's access latency. As such, we use the reciprocal of the number of outstanding memory requests so that the MLP ratio represents the fraction of the access latency that impacts the application's performance. During a sampling period t, the MLP ratio for an application with N_read,t / N_write,t outstanding read/write requests is as follows, respectively for reads and writes:

MLPRatio_read,t = 1 / N_read,t        MLPRatio_write,t = 1 / N_write,t    (3)

We can use the MLP ratio of the application to determine the MLP ratio for individual pages. For most applications, different pages do not typically have equal amounts of MLP. Therefore, we approximate an average MLP ratio for each page across all of the sampling periods that have taken place so far in the current interval. We compute two values, PageMLPRatio_read and PageMLPRatio_write, which are the average MLP ratio of a page during the interval for outstanding read and write requests, respectively, to that page.

[5] Without loss of generality, we use the term application to refer to a hardware thread context executing an application.
We can model PageMLPRatio_read and PageMLPRatio_write as:

PageMLPRatio_read = (Σ_t MLPRatio_read,t × m_read,t) / (Σ_t m_read,t) = (Σ_t m_read,t / N_read,t) / (Σ_t m_read,t)    (4)
PageMLPRatio_write = (Σ_t MLPRatio_write,t × m_write,t) / (Σ_t m_write,t) = (Σ_t m_write,t / N_write,t) / (Σ_t m_write,t)

To calculate PageMLPRatio_read, we start with the overall application MLP ratio at each sampling period t (MLPRatio_read,t). We determine the total contribution of the page to the application's MLP during sampling period t by multiplying MLPRatio_read,t with the number of outstanding read requests during the sampling period to the page (m_read,t). We then sum up the page's MLP contributions over all of the sampling periods so far in the current interval, and divide it by the total number of outstanding read requests to the page during these sampling periods. This, in effect, gives us the average MLP contribution of each outstanding read request for the page. We repeat the same calculation for write requests. We can now combine the latency reduction (Equation 2) and the average MLP ratio (Equation 4) to determine the stall time reduction for Application i as a result of migrating a particular page:

ΔStallTime_i = ΔReadLatency × PageMLPRatio_read + p × ΔWriteLatency × PageMLPRatio_write    (5)

where p represents the probability that the write requests appear on the critical path. Prior work [130] has shown that this probability is dependent on an application's write access pattern, and is generally larger if the application has a large number of write requests. For simplicity, we choose to set p = 1, though using an online iterative approach to determine p [130] may yield better performance, since it can enhance the accuracy of the stall time estimation. Equation 5 shows that the stall time reduction due to a page migration from slow memory to fast memory can be determined by using a combination of access frequency, row buffer locality, and MLP for each page. Intuitively, a high access frequency and low row buffer locality increase the number of total row buffer misses, thus enlarging the benefits of migrating to fast memory.
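Equations 2-5 can be sketched in software as follows; the paper implements them with hardware counters, and the latency values and sampling data below are illustrative assumptions.

```python
# Software rendering of Equations 2-5 (the paper uses hardware counters).
def page_mlp_ratio(samples):
    """Equations 3-4. samples: one (m_t, N_t) pair per sampling period t,
    where m_t = outstanding requests to this page and N_t = outstanding
    requests of the whole application (so MLPRatio_t = 1 / N_t)."""
    num = sum(m / n for m, n in samples if n > 0)   # sum_t m_t * MLPRatio_t
    den = sum(m for m, _ in samples)                # sum_t m_t
    return num / den if den else 0.0

def stall_time_reduction(n_read_miss, n_write_miss,
                         read_samples, write_samples,
                         t_slow_rd, t_fast_rd, t_slow_wr, t_fast_wr, p=1.0):
    """Equations 2 and 5: row-buffer-miss latency reduction, scaled by the
    page's average MLP ratios (p = write critical-path probability)."""
    d_read = n_read_miss * (t_slow_rd - t_fast_rd)       # Equation 2
    d_write = n_write_miss * (t_slow_wr - t_fast_wr)
    return (d_read * page_mlp_ratio(read_samples)        # Equation 5
            + p * d_write * page_mlp_ratio(write_samples))

# A page whose read requests are always serviced alone (m_t = N_t = 1)
# receives the full latency reduction: 10 misses * (150 - 50) = 1000.
print(stall_time_reduction(10, 0, [(1, 1)] * 4, [], 150, 50, 0, 0))  # 1000.0
```

If the same page instead always had three other outstanding requests alongside it (N_t = 4), its MLP ratio would fall to 0.25 and the estimated stall-time reduction would shrink accordingly, matching the intuition of Section 3.1.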
Likewise, poor MLP, with fewer concurrent outstanding requests, increases the average MLP ratio due to the low likelihood of overlapping the request latency, and also increases the benefits from migration.

4.2.2. Estimating System Performance Sensitivity. For multiprogrammed workloads, we use the weighted speedup metric [27, 112] to characterize system performance.[7] For each application, the speedup component of Application i is the ratio of its execution time when running alone, i.e., without interference from other applications (T_alone,i), to that when running together with other applications (T_shared,i):

System Performance = Σ_i Speedup_i = Σ_i T_alone,i / T_shared,i    (6)

[6] We calculate the MLP ratio separately for reads and writes, to account for their different behavior in main memory. While reads are often serviced as soon as possible (as they can fall along the critical path of execution), writes are deferred, and are eventually drained in batches [56, 110]. Distinguishing between reads and writes allows us to more accurately determine the MLP behavior affecting each type of request.

[7] UH-MEM can be adapted to use different system performance or fairness metrics [22, 24, 32, 47, 48, 86, 88, 93, 116, 117, 121, 125]. In order to support different system performance metrics, we can implement logic to estimate the sensitivity for each metric, and let the OS choose the most suitable metric to optimize based on the applications currently running within the system and the user's preferences.
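A minimal sketch of the weighted speedup metric in Equation 6, together with the per-application sensitivity it induces (Speedup_i / T_shared,i, derived as Equation 9); the execution times below are made-up illustrative values.

```python
# Sketch of Equation 6 (weighted speedup) and the per-application
# sensitivity Speedup_i / T_shared,i (Equation 9). Times are illustrative.
def weighted_speedup(t_alone, t_shared):
    """Equation 6: sum over applications of T_alone,i / T_shared,i."""
    return sum(a / s for a, s in zip(t_alone, t_shared))

def sensitivity(speedup_i, t_shared_i):
    """Equation 9: system performance gained per cycle of stall-time
    reduction for Application i."""
    return speedup_i / t_shared_i

t_alone = [100.0, 300.0]
t_shared = [200.0, 400.0]
print(weighted_speedup(t_alone, t_shared))                  # 1.25
# The first app gains more system performance per saved stall cycle:
print(sensitivity(0.5, 200.0) > sensitivity(0.75, 400.0))   # True
```

This illustrates the non-uniformity argued in Section 3.2: an identical absolute stall-time reduction is worth more to the system when granted to the application with the higher sensitivity.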

When Application i migrates a page to fast memory, the speedup of that application improves by ΔSpeedup_i:

$$Speedup_i' = \frac{T_{alone,i}}{T_{shared,i} - \Delta StallTime_i} \qquad (7)$$

Since the stall time reduction due to page migration is generally much smaller than the execution time (ΔStallTime_i ≪ T_alone,i, T_shared,i), we can perform a Taylor expansion to find the change in speedup:

$$\Delta Speedup_i = \frac{T_{alone,i}}{T_{shared,i} - \Delta StallTime_i} - \frac{T_{alone,i}}{T_{shared,i}} = \frac{T_{alone,i}\,\Delta StallTime_i}{(T_{shared,i} - \Delta StallTime_i)\,T_{shared,i}} \approx \frac{Speedup_i}{T_{shared,i}}\,\Delta StallTime_i \qquad (8)$$

We defined the performance sensitivity of the system to an application in Section 3.1 as the measure of how the change in an application's stall time impacts the overall system performance. We can thus estimate it using Equation 9 (by plugging in Equation 8 at the appropriate place):

$$Sensitivity_i = \frac{\Delta Performance}{\Delta StallTime_i} = \frac{\Delta Speedup_i}{\Delta StallTime_i} = \frac{Speedup_i}{T_{shared,i}} \qquad (9)$$

We calculate the performance sensitivity using an interval-based approach, where the speedup (Speedup_i) and execution time (T_shared,i) obtained in the last interval are used to estimate performance sensitivity in the current interval. The execution time of each application running on the system is equal to the length of an interval. We need to estimate the speedup of the application (Speedup_i) during the interval. This speedup estimate can be obtained by using prior proposals [22, 23, 84, 88, 118, 119]. These works consider the impact of memory interference and/or cache contention on the speedup of an application. In our implementation, we estimate speedup based on the approach in [88].

Equations 5 and 9 are combined using Equation 1 to give us the overall utility of migrating the page in question. A few measurements are required to obtain this utility calculation, and we discuss the implementation details of these mechanisms in Section 4.4.

4.3. Performing Page Migration

Algorithm 1 summarizes how UH-MEM decides which pages it should move to the fast memory. Whenever an outstanding memory request completes, UH-MEM (1) updates counters that hold statistics for the page accessed by the request, (2) recalculates the utility of the page, and (3) compares the calculated utility with the migration threshold.
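This per-request decision, together with the end-of-interval hill-climbing threshold adjustment described below, can be sketched as follows. Equation 1 is not reproduced in this excerpt; this sketch assumes it multiplies the stall time reduction (Equation 5) by the sensitivity (Equation 9). All parameter values (initial threshold, step size) are illustrative, not values from the paper.

```python
class UhMemPolicy:
    """Sketch of UH-MEM's migration decision (Algorithm 1), assuming
    Utility = dStallTime_i * Sensitivity_i as the form of Equation 1."""

    def __init__(self, threshold=100.0, step=10.0):
        self.threshold = threshold
        self.step = step               # signed hill-climbing step
        self.prev_total_stall = None

    def sensitivity(self, speedup_last, t_shared_last):
        # Equation 9, using values measured in the previous interval.
        return speedup_last / t_shared_last

    def utility(self, d_stall_time, speedup_last, t_shared_last):
        return d_stall_time * self.sensitivity(speedup_last, t_shared_last)

    def should_migrate(self, d_stall_time, speedup_last, t_shared_last):
        # Migrate only if the utility exceeds the migration threshold.
        return self.utility(d_stall_time, speedup_last, t_shared_last) > self.threshold

    def end_of_interval(self, total_stall_time):
        # Hill climbing: keep direction if total stall time dropped, else reverse.
        if self.prev_total_stall is not None and total_stall_time > self.prev_total_stall:
            self.step = -self.step
        self.threshold = max(0.0, self.threshold + self.step)
        self.prev_total_stall = total_stall_time
```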
The page will only be migrated from slow memory to fast memory if the utility exceeds the migration threshold. At the end of each interval, UH-MEM adjusts the migration threshold to account for transient application behavior, and clears the page statistics counters.

Algorithm 1: Migrating pages with UH-MEM.
1: for every interval do
2:   for every completed memory request do
3:     Update the corresponding page's statistics counters
4:     Calculate the page's utility (Section 4.2)
5:     if the page's utility exceeds the migration threshold then
6:       Migrate the page to the fast memory
7:     end if
8:   end for
9:   if at the end of the interval then
10:    Adjust the migration threshold (Section 4.3)
11:    Estimate speedup for each application (Section 4.2.2)
12:    Reset all counters to zero
13:  end if
14: end for

A key question is how to determine this migration threshold. We choose to use a hill-climbing-based approach to determine this threshold dynamically, similar to the policy used by Yoon et al. [126]. We use the total stall time of all applications in each interval to reflect the system performance. At the end of each interval, the total stall time is recalculated. We then compare the current total stall time with the total stall time from the previous interval, and determine whether the previous threshold adjustment yielded a system performance improvement. If the total stall time of the current interval is lower (meaning that the threshold adjustment improved system performance), we continue to adjust the threshold in the same direction. Otherwise, since the previous adjustment degraded performance, we move the threshold in the opposite direction.

4.4. Hardware Structures

UH-MEM performs the calculations described in Section 4.2 in hardware. We first discuss the various hardware components required for UH-MEM to calculate the MLP ratios and page utility. Then, we summarize the total cost of the hardware.

MLP Ratio Calculation. To calculate the MLP ratios from Equation 4, we must maintain four temporary counters for every page with outstanding requests in the memory controller.
Two of the counters, MLPAcc_read and MLPAcc_write, accumulate the numerator from Equation 4, while the other two counters, MLPWeight_read and MLPWeight_write, accumulate the denominator of the equation, as follows:

$$MLPAcc_{read} = \sum_t \frac{m_{read,t}}{N_{read,t}} \qquad MLPWeight_{read} = \sum_t m_{read,t} \qquad MLPAcc_{write} = \sum_t \frac{m_{write,t}}{N_{write,t}} \qquad MLPWeight_{write} = \sum_t m_{write,t} \qquad (10)$$

For every sampling period (30 cycles in our experiments), we monitor both the outstanding read/write requests N_read and N_write for each application, as well as the outstanding requests m_read and m_write for each page, and update the corresponding counters.

When all the outstanding requests to a page have completed, the contents of the page's temporary counters are added to its corresponding counters in a statistics store (i.e., stats store), and are then reset. The stats store is a 32-way set-associative cache with an LRU replacement policy, residing in the memory controller. Each stats store entry corresponds to a page, and consists of six counters that record the number of row buffer misses, the sum of weighted MLP ratios (MLPAcc), and the sum of weights for the MLP ratios (MLPWeight), separately for read and write requests. We can use the ratio of MLPAcc to MLPWeight to calculate the average MLP ratio of the page (PageMLPRatio), respectively for read and write requests. When a page in slow memory is accessed, if it has an existing entry in the stats store, the content of its entry is updated; otherwise, an entry is allocated, which may evict the entry of the least recently used page within the set. The access latency to the stats store is not on the critical path, as we update the stats store in the background.

When a system has multiple memory controllers, the stats store and the counters used to calculate MLP ratios need to be shared by these memory controllers. Different memory controllers need to communicate with each other to maintain the information, such as the number of outstanding requests, as done in prior works [17, 36, 47, 85, 86].

Utility Calculation for Shared Pages. For pages shared by multiple applications, we can use separate entries in the stats store to record the statistical information of the page with respect to each application. We can use our previous method to calculate the page utility for each application, and then add these utility values to obtain the aggregate utility for the page. The insight is that the total system performance improvement correlates with the sum of the performance improvements of each application. Therefore, summing up the page utility for each application (i.e., its performance improvement) should reflect the system performance improvement.

Hardware Cost. Table 2 describes the main hardware costs for UH-MEM. The largest component is the stats store.
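As a software analogue of the stats store, the sketch below models a 32-way set-associative structure with LRU replacement. The set-indexing function and the dictionary entry fields are simplifications we assume for illustration; the hardware holds fixed-width counters instead.

```python
from collections import OrderedDict

class StatsStore:
    """Sketch of the 32-way set-associative, LRU-replaced stats store."""
    WAYS = 32

    def __init__(self, entries=2048):
        self.num_sets = entries // self.WAYS
        # Each set is an OrderedDict ordered from LRU (front) to MRU (back).
        self.sets = [OrderedDict() for _ in range(self.num_sets)]

    def lookup(self, page):
        s = self.sets[page % self.num_sets]   # assumed set-index function
        if page in s:
            s.move_to_end(page)               # refresh LRU position
            return s[page]
        if len(s) >= self.WAYS:
            s.popitem(last=False)             # evict least recently used entry
        # Six per-page counters, mirroring the entry fields in Table 2.
        s[page] = {"rb_miss_rd": 0, "rb_miss_wr": 0,
                   "mlp_acc_rd": 0.0, "mlp_acc_wr": 0.0,
                   "mlp_wt_rd": 0, "mlp_wt_wr": 0}
        return s[page]
```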
We use a 2048-entry stats store (organized as a 32-way set-associative cache), as it leads to negligible performance degradation compared with an unlimited-size stats store. The main hardware cost of UH-MEM is 42.87KB,[8] which is only approximately 2% of our baseline system's L2 cache size.

UH-MEM also requires hardware logic to calculate the MLP ratios. For each page with outstanding requests in slow memory (96 at most, limited by the read request queue size and write buffer), we need to perform 4 25-bit additions and 2 fast divisions every 30 cycles to compute the MLP ratios.[9] We achieve this by pipelining the logic, and making it 3-way superscalar. We can implement fast division using a ROM table that contains the precomputed results of the division, since both the numerator and denominator of the division are limited by the MSHR size of the last-level cache. As each quotient is 10 bits wide, the total size of such a ROM table is 1.25KB.

UH-MEM does not require any modifications to the operating system to support page migration. This is because UH-MEM does not use the virtual or physical address of a page to determine whether the page resides in fast memory or slow memory. Instead, UH-MEM uses a dedicated hardware tag store in the memory controller to determine whether the page has been migrated to the fast memory.

5. Evaluation Methodology

Similar to prior works [39, 104, 106, 126], we evaluate our proposed UH-MEM mechanism using a cycle-accurate x86 multicore simulator [2], whose front end is based on Pin [70]. We released our simulator [2, 109]. This in-house developed simulator is similar to Ramulator [1, 50], which is a widely-accepted open-source multicore simulator that models the main memory system in detail. In our simulator, page migrations between fast and slow memories are modeled as additional read requests to the memory device where the page is currently located, to read the entire page from it, followed by additional write requests in the destination memory device to write the entire page. The latency for determining whether a page resides in fast or slow memory is modeled as six cycles.
Table 3 summarizes the major parameters of the baseline system consisting of DRAM and NVM.

Footnote 8: This does not include the hardware used to determine whether a page resides in fast memory or slow memory, as this hardware is required by most hybrid memory management mechanisms [104, 106, 126], and the implementation of UH-MEM is orthogonal to the implementation of this structure.

Footnote 9: We determined all values empirically and did not optimize heavily. Reduction in hardware cost is possible with careful optimization.

Each entry below lists a structure's name, its purpose, its organization (number of bits in parentheses), and its size:

Stats store. Purpose: tracks statistical information for recently-accessed pages. Structure: 2048 entries; each entry consists of read row buffer miss count (14), write row buffer miss count (14), MLPAcc_read (30), MLPAcc_write (30), MLPWeight_read (21), MLPWeight_write (21), and page number tag (30). Size: 40.00KB.

Counters for outstanding pages in slow memory. Purpose: record updates of MLPAcc and MLPWeight for pages with outstanding requests. Structure: for each page with outstanding requests in slow memory (96 at most), MLPAcc_read (30), MLPAcc_write (30), MLPWeight_read (21), MLPWeight_write (21), and page number (36). Size: 1.62KB.

ROM table for MLP ratios. Purpose: stores precomputed results of the division used to calculate MLP ratios. Structure: 32 x 32 entries; each entry consumes 10 bits. Size: 1.25KB.

Total hardware cost (for our evaluated system in Table 3): 42.87KB.

Table 2: Main hardware cost of UH-MEM.
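The ROM-table row above can be illustrated in software. The excerpt specifies only the geometry (32 x 32 entries of 10 bits each); encoding each quotient as a 10-bit fixed-point fraction is our assumption for this sketch.

```python
def build_div_rom(max_val=32, qbits=10):
    """Sketch of a precomputed division table: rom[num][den] approximates
    num/den as a qbits-wide fixed-point value (assumed encoding)."""
    scale = (1 << qbits) - 1                  # 1023: largest 10-bit value
    rom = [[0] * max_val for _ in range(max_val)]
    for num in range(max_val):
        for den in range(1, max_val):         # den = 0 cells stay 0
            rom[num][den] = min(scale, (num * scale) // den)
    return rom
```

In hardware, the numerator and denominator (both bounded by the last-level cache MSHR size) simply index this table, so no divider circuit is needed.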

Processor: 8 cores, 2.67GHz, 3-wide issue, 128-entry instruction window.
L1 Cache: 32KB per core, 4-way, 64B cache block.
L2 Cache: 256KB per core, 8-way, 32 MSHR entries per core, 64B cache block.
Fast Memory Controller: 64-bit channel, 64-entry read request queue, 32-entry write buffer, FR-FCFS scheduling policy [108, 132].
Slow Memory Controller: 64-bit channel, 64-entry read request queue, 32-entry write buffer, FR-FCFS scheduling policy [108, 132].
Baseline Fast Memory System: 512MB DRAM, 1 rank (8 banks), tCLK = 1.875ns, tCL = 15ns, tRCD = 15ns, tRP = 15ns, tWR = 15ns, array read (write) energy = 1.17 (0.39) pJ/bit, row buffer read (write) energy = 0.93 (1.02) pJ/bit.
Baseline Slow Memory System: 16GB NVM, 1 rank (8 banks), tCLK = 1.875ns, tCL = 15ns, tRCD = 67.5ns, tRP = 15ns, tWR = 180ns, array read (write) energy = 2.47 (16.82) pJ/bit, row buffer read (write) energy = 0.93 (1.02) pJ/bit.

Table 3: Baseline system parameters.

The detailed DRAM and NVM timing and energy parameters are based on prior studies [53, 54, 78, 79, 81]. We calculate the static power of the hybrid memory system to be 5.6W [53]. In order to evaluate different types of hybrid memory systems, such as DRAM-RLDRAM and DRAM-NVM memories, we vary the size of the fast memory and the read/write latency ratios of slow memory to fast memory. We also measure the performance of our evaluated page placement mechanisms under these different configurations.

5.1. Workloads

We use 30 benchmarks chosen from SPEC CPU2006 [35] and the Yahoo Cloud Serving Benchmark (YCSB) suite [16]. We classify them as memory-intensive or non-memory-intensive based on their last-level cache misses per 1K instructions (MPKI) when running alone. Each experiment runs an eight-application workload on the system, with one application running on each core. The memory intensity category of the workload is determined by the percentage of memory-intensive benchmarks within the workload. For example, a workload has 75% intensity if it consists of six memory-intensive benchmarks and two non-memory-intensive benchmarks.
We generate 40 workloads, eight for each category of workload memory intensity (0%, 25%, 50%, 75%, 100%). In each experiment, every benchmark was warmed up for 500 million instructions, and then executed for another 500 million instructions. A benchmark in a multiprogrammed workload is restarted after it completes, until all the benchmarks in the workload complete once.

5.2. Metrics

We use weighted speedup (WSpeedup) [26, 112] and maximum slowdown (MaxSlowdown) [6, 17, 18, 43, 44, 47, 48, 86, 116, 117, 119, 121, 123] to evaluate system performance and unfairness, respectively, using the equations shown below. N is the number of cores; IPC_alone,i and IPC_shared,i are the instructions completed per cycle (IPC) when Application i is running alone and running with other applications, respectively. Weighted speedup (see Section 4.2) first weighs the performance of each application (when it is running with others; IPC_shared,i) by the reciprocal of its performance while running alone (IPC_alone,i), reflecting the speedup of the application. Then, weighted speedup sums up the speedups of all the applications, reflecting the overall system performance. Weighted speedup is a widely-used multiprogrammed system performance metric in computer architecture evaluation [26]. It quantifies system throughput [26]. For unfairness, we use maximum slowdown to quantify the worst-case slowdown of any application in a multiprogrammed workload. Both weighted speedup and maximum slowdown use normalized IPC ratios, instead of the IPC itself, to avoid biasing either metric in favor of high-IPC or low-IPC applications.

$$WSpeedup = \sum_{i=0}^{N-1} \frac{IPC_{shared,i}}{IPC_{alone,i}} \qquad MaxSlowdown = \max_i \left( \frac{IPC_{alone,i}}{IPC_{shared,i}} \right)$$

6. Experimental Results

We evaluate our proposed UH-MEM mechanism across a wide variety of system configurations, covering several fast memory sizes and latency ratios of slow memory to fast memory. Throughout our evaluation, we compare UH-MEM to three other state-of-the-art mechanisms:

ALL: a conventional cache insertion mechanism.
This mechanism treats fast memory as a cache to slow memory, and inserts all the pages accessed in slow memory into fast memory using the LRU replacement policy. This is similar to the proposal by Qureshi et al. [104].

FREQ: an access-frequency-based mechanism. This mechanism migrates pages with high access frequency to fast memory. It is similar to two proposals that try to improve the temporal locality in fast memory and reduce the number of accesses to slow memory [39, 106].

RBLA: a row-buffer-locality-based mechanism [126]. This mechanism migrates pages that have experienced a large number of row buffer misses in slow memory to fast memory. The intuition is that only the latency of row buffer miss requests can be reduced when the page is migrated to fast memory.

6.1. Results on the Baseline System Configuration

Figure 4 shows the normalized weighted speedup of the four evaluated mechanisms on the baseline system configuration, averaged for each workload intensity category. UH-MEM outperforms the best previous proposal, RBLA, in all workload categories with non-zero memory intensity. For the most memory-intensive category, UH-MEM provides a 14%

[Figures 4 through 9 appear here, each comparing ALL, FREQ, RBLA, and UH-MEM. Figure 4: Normalized weighted speedup for the baseline configuration. Figure 5: Average application stall time for the baseline configuration. Figure 6: Normalized unfairness for the baseline configuration. Figure 7: Memory energy consumption for the baseline configuration. Figure 8: Weighted speedup for various fast memory sizes. Figure 9: Weighted speedup for various slow-to-fast memory latency ratios for tRCD and tWR.]

average performance improvement over RBLA. The maximum performance gain of UH-MEM over RBLA for a single workload is 26%. UH-MEM's performance advantage is twofold. First, UH-MEM not only considers the latency of each individual request (as FREQ and RBLA do), but also takes into account the memory-level parallelism between requests to estimate each request's individual contribution to the application's overall stall time. Therefore, UH-MEM can reduce stall time more effectively than those prior proposals, by selecting and caching those pages that are more likely to stall the processor. This is demonstrated by Figure 5, which shows that each application within a workload stalls for less time with UH-MEM than with RBLA. Second, UH-MEM is aware of which applications impact the system performance the most, as it estimates system performance sensitivity to different applications, and prioritizes page migrations from those applications that are likely to benefit system performance the most.
Figure 6 shows the normalized unfairness of the four evaluated mechanisms on the baseline system configuration. We can see that UH-MEM achieves equivalent or improved fairness compared to all prior proposals.

We also study the energy efficiency of the four mechanisms on the baseline system configuration. Figure 7 shows the memory energy consumption of the four mechanisms on workloads with varying memory intensities. We observe that energy consumption grows with the memory intensity of the workload. Compared to prior mechanisms, UH-MEM consumes similar energy for non-memory-intensive workloads, and uses less energy for memory-intensive workloads. For the memory-intensive workloads, UH-MEM reduces static energy consumption as a result of its shorter execution time. UH-MEM also reduces the dynamic energy consumed due to page migrations, as it selectively migrates the important pages to DRAM instead of migrating less important pages as the baseline mechanisms do. We conclude that UH-MEM improves performance and lowers energy consumption compared to three state-of-the-art hybrid memory management mechanisms, because it can effectively gauge the system performance benefit of each page migration.

6.2. Sensitivity to Fast Memory Size

The fast memory size determines the room for performance optimization in hybrid memory systems. A larger fast memory can allow more pages to migrate from slow memory, thereby likely offering greater system performance. However, the fast memory size, in practice, cannot be too large, and can therefore limit the scalability of hybrid memory systems. In this section, we evaluate how each mechanism performs across a range of fast memory sizes (256MB, 512MB, 1GB, and 2GB). Figure 8 shows the weighted speedup of workloads with 100% memory intensity under various fast memory sizes. We observe that system performance increases with fast memory size. Under the four evaluated sizes, UH-MEM outperforms RBLA by 14%, 14%, 12%, and 12%, respectively.
Even for a 256MB fast memory, which offers less opportunity for optimization, UH-MEM achieves a weighted speedup of 3.30, which is larger than RBLA's weighted speedup of 3.04 for a 2GB fast memory. In other words, UH-MEM can exceed RBLA's performance even with only an eighth of the fast memory capacity. This implies that, by estimating the system performance benefit of each page and selectively placing only critical pages in fast memory, UH-MEM can greatly shrink the fast memory size (while achieving higher performance), and thereby improve hybrid memory scalability.


More information

An Efficient Delivery Scheme for Coded Caching

An Efficient Delivery Scheme for Coded Caching 201 27h Inernaional Teleraffic Congress An Efficien Delivery Scheme for Coded Caching Abinesh Ramakrishnan, Cedric Wesphal and Ahina Markopoulou Deparmen of Elecrical Engineering and Compuer Science, Universiy

More information

Opportunistic Flooding in Low-Duty-Cycle Wireless Sensor Networks with Unreliable Links

Opportunistic Flooding in Low-Duty-Cycle Wireless Sensor Networks with Unreliable Links 1 in Low-uy-ycle Wireless Sensor Neworks wih Unreliable Links Shuo uo, Suden Member, IEEE, Liang He, Member, IEEE, Yu u, Member, IEEE, o Jiang, Suden Member, IEEE, and Tian He, Member, IEEE bsrac looding

More information

Y. Tsiatouhas. VLSI Systems and Computer Architecture Lab

Y. Tsiatouhas. VLSI Systems and Computer Architecture Lab CMOS INEGRAED CIRCUI DESIGN ECHNIQUES Universiy of Ioannina Clocking Schemes Dep. of Compuer Science and Engineering Y. siaouhas CMOS Inegraed Circui Design echniques Overview 1. Jier Skew hroughpu Laency

More information

Sam knows that his MP3 player has 40% of its battery life left and that the battery charges by an additional 12 percentage points every 15 minutes.

Sam knows that his MP3 player has 40% of its battery life left and that the battery charges by an additional 12 percentage points every 15 minutes. 8.F Baery Charging Task Sam wans o ake his MP3 player and his video game player on a car rip. An hour before hey plan o leave, he realized ha he forgo o charge he baeries las nigh. A ha poin, he plugged

More information

FIELD PROGRAMMABLE GATE ARRAY (FPGA) AS A NEW APPROACH TO IMPLEMENT THE CHAOTIC GENERATORS

FIELD PROGRAMMABLE GATE ARRAY (FPGA) AS A NEW APPROACH TO IMPLEMENT THE CHAOTIC GENERATORS FIELD PROGRAMMABLE GATE ARRAY (FPGA) AS A NEW APPROACH TO IMPLEMENT THE CHAOTIC GENERATORS Mohammed A. Aseeri and M. I. Sobhy Deparmen of Elecronics, The Universiy of Ken a Canerbury Canerbury, Ken, CT2

More information

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. XX, NO. XX, XX XXXX 1

IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. XX, NO. XX, XX XXXX 1 This is he auhor's version of an aricle ha has been published in his journal. Changes were made o his version by he publisher prior o publicaion. IEEE TRANSACTIONS ON WIRELESS COMMUNICATIONS, VOL. XX,

More information

Adaptive Workflow Scheduling on Cloud Computing Platforms with Iterative Ordinal Optimization

Adaptive Workflow Scheduling on Cloud Computing Platforms with Iterative Ordinal Optimization Adapive Workflow Scheduling on Cloud Compuing Plaforms wih Ieraive Ordinal Opimizaion Fan Zhang, Senior Member, IEEE; Junwei Cao, Senior Member, IEEE; Kai Hwang, Fellow, IEEE; Keqin Li, Senior Member,

More information

Optimal Crane Scheduling

Optimal Crane Scheduling Opimal Crane Scheduling Samid Hoda, John Hooker Laife Genc Kaya, Ben Peerson Carnegie Mellon Universiy Iiro Harjunkoski ABB Corporae Research EWO - 13 November 2007 1/16 Problem Track-mouned cranes move

More information

Michiel Helder and Marielle C.T.A Geurts. Hoofdkantoor PTT Post / Dutch Postal Services Headquarters

Michiel Helder and Marielle C.T.A Geurts. Hoofdkantoor PTT Post / Dutch Postal Services Headquarters SHORT TERM PREDICTIONS A MONITORING SYSTEM by Michiel Helder and Marielle C.T.A Geurs Hoofdkanoor PTT Pos / Duch Posal Services Headquarers Keywords macro ime series shor erm predicions ARIMA-models faciliy

More information

Outline. EECS Components and Design Techniques for Digital Systems. Lec 06 Using FSMs Review: Typical Controller: state

Outline. EECS Components and Design Techniques for Digital Systems. Lec 06 Using FSMs Review: Typical Controller: state Ouline EECS 5 - Componens and Design Techniques for Digial Sysems Lec 6 Using FSMs 9-3-7 Review FSMs Mapping o FPGAs Typical uses of FSMs Synchronous Seq. Circuis safe composiion Timing FSMs in verilog

More information

Dimmer time switch AlphaLux³ D / 27

Dimmer time switch AlphaLux³ D / 27 Dimmer ime swich AlphaLux³ D2 426 26 / 27! Safey noes This produc should be insalled in line wih insallaion rules, preferably by a qualified elecrician. Incorrec insallaion and use can lead o risk of elecric

More information

CENG 477 Introduction to Computer Graphics. Modeling Transformations

CENG 477 Introduction to Computer Graphics. Modeling Transformations CENG 477 Inroducion o Compuer Graphics Modeling Transformaions Modeling Transformaions Model coordinaes o World coordinaes: Model coordinaes: All shapes wih heir local coordinaes and sies. world World

More information

Quick Verification of Concurrent Programs by Iteratively Relaxed Scheduling

Quick Verification of Concurrent Programs by Iteratively Relaxed Scheduling Quick Verificaion of Concurren Programs by Ieraively Relaxed Scheduling Parick Mezler, Habib Saissi, Péer Bokor, Neeraj Suri Technische Univerisä Darmsad, Germany {mezler, saissi, pbokor, suri}@deeds.informaik.u-darmsad.de

More information

Visual Indoor Localization with a Floor-Plan Map

Visual Indoor Localization with a Floor-Plan Map Visual Indoor Localizaion wih a Floor-Plan Map Hang Chu Dep. of ECE Cornell Universiy Ihaca, NY 14850 hc772@cornell.edu Absrac In his repor, a indoor localizaion mehod is presened. The mehod akes firsperson

More information

MATH Differential Equations September 15, 2008 Project 1, Fall 2008 Due: September 24, 2008

MATH Differential Equations September 15, 2008 Project 1, Fall 2008 Due: September 24, 2008 MATH 5 - Differenial Equaions Sepember 15, 8 Projec 1, Fall 8 Due: Sepember 4, 8 Lab 1.3 - Logisics Populaion Models wih Harvesing For his projec we consider lab 1.3 of Differenial Equaions pages 146 o

More information

An efficient approach to improve throughput for TCP vegas in ad hoc network

An efficient approach to improve throughput for TCP vegas in ad hoc network Inernaional Research Journal of Engineering and Technology (IRJET) e-issn: 395-0056 Volume: 0 Issue: 03 June-05 www.irje.ne p-issn: 395-007 An efficien approach o improve hroughpu for TCP vegas in ad hoc

More information

A Tool for Multi-Hour ATM Network Design considering Mixed Peer-to-Peer and Client-Server based Services

A Tool for Multi-Hour ATM Network Design considering Mixed Peer-to-Peer and Client-Server based Services A Tool for Muli-Hour ATM Nework Design considering Mied Peer-o-Peer and Clien-Server based Services Conac Auhor Name: Luis Cardoso Company / Organizaion: Porugal Telecom Inovação Complee Mailing Address:

More information

Motor Control. 5. Control. Motor Control. Motor Control

Motor Control. 5. Control. Motor Control. Motor Control 5. Conrol In his chaper we will do: Feedback Conrol On/Off Conroller PID Conroller Moor Conrol Why use conrol a all? Correc or wrong? Supplying a cerain volage / pulsewidh will make he moor spin a a cerain

More information

USBFC (USB Function Controller)

USBFC (USB Function Controller) USBFC () EIFUFAL501 User s Manual Doc #: 88-02-E01 Revision: 2.0 Dae: 03/24/98 (USBFC) 1. Highlighs... 4 1.1 Feaures... 4 1.2 Overview... 4 1.3 USBFC Block Diagram... 5 1.4 USBFC Typical Sysem Block Diagram...

More information

Design Alternatives for a Thin Lens Spatial Integrator Array

Design Alternatives for a Thin Lens Spatial Integrator Array Egyp. J. Solids, Vol. (7), No. (), (004) 75 Design Alernaives for a Thin Lens Spaial Inegraor Array Hala Kamal *, Daniel V azquez and Javier Alda and E. Bernabeu Opics Deparmen. Universiy Compluense of

More information

The Difference-bit Cache*

The Difference-bit Cache* The Difference-bi Cache* Toni Juan, Tomas Lang~ and Juan J. Navarro Deparmen of Compuer Archiecure Deparmen of Elecrical and Universia Poli&cnica de Caalunya Compuer Engineering Gran CapiiJ s/n, Modul

More information

4.1 3D GEOMETRIC TRANSFORMATIONS

4.1 3D GEOMETRIC TRANSFORMATIONS MODULE IV MCA - 3 COMPUTER GRAPHICS ADMN 29- Dep. of Compuer Science And Applicaions, SJCET, Palai 94 4. 3D GEOMETRIC TRANSFORMATIONS Mehods for geomeric ransformaions and objec modeling in hree dimensions

More information

A Progressive-ILP Based Routing Algorithm for Cross-Referencing Biochips

A Progressive-ILP Based Routing Algorithm for Cross-Referencing Biochips 16.3 A Progressive-ILP Based Rouing Algorihm for Cross-Referencing Biochips Ping-Hung Yuh 1, Sachin Sapanekar 2, Chia-Lin Yang 1, Yao-Wen Chang 3 1 Deparmen of Compuer Science and Informaion Engineering,

More information

A GRAPHICS PROCESSING UNIT IMPLEMENTATION OF THE PARTICLE FILTER

A GRAPHICS PROCESSING UNIT IMPLEMENTATION OF THE PARTICLE FILTER A GRAPHICS PROCESSING UNIT IMPLEMENTATION OF THE PARTICLE FILTER Gusaf Hendeby, Jeroen D. Hol, Rickard Karlsson, Fredrik Gusafsson Deparmen of Elecrical Engineering Auomaic Conrol Linköping Universiy,

More information

A Numerical Study on Impact Damage Assessment of PC Box Girder Bridge by Pounding Effect

A Numerical Study on Impact Damage Assessment of PC Box Girder Bridge by Pounding Effect A Numerical Sudy on Impac Damage Assessmen of PC Box Girder Bridge by Pounding Effec H. Tamai, Y. Sonoda, K. Goou and Y.Kajia Kyushu Universiy, Japan Absrac When a large earhquake occurs, displacemen response

More information

MOBILE COMPUTING 3/18/18. Wi-Fi IEEE. CSE 40814/60814 Spring 2018

MOBILE COMPUTING 3/18/18. Wi-Fi IEEE. CSE 40814/60814 Spring 2018 MOBILE COMPUTING CSE 40814/60814 Spring 2018 Wi-Fi Wi-Fi: name is NOT an abbreviaion play on Hi-Fi (high fideliy) Wireless Local Area Nework (WLAN) echnology WLAN and Wi-Fi ofen used synonymous Typically

More information

MOBILE COMPUTING. Wi-Fi 9/20/15. CSE 40814/60814 Fall Wi-Fi:

MOBILE COMPUTING. Wi-Fi 9/20/15. CSE 40814/60814 Fall Wi-Fi: MOBILE COMPUTING CSE 40814/60814 Fall 2015 Wi-Fi Wi-Fi: name is NOT an abbreviaion play on Hi-Fi (high fideliy) Wireless Local Area Nework (WLAN) echnology WLAN and Wi-Fi ofen used synonymous Typically

More information

Why not experiment with the system itself? Ways to study a system System. Application areas. Different kinds of systems

Why not experiment with the system itself? Ways to study a system System. Application areas. Different kinds of systems Simulaion Wha is simulaion? Simple synonym: imiaion We are ineresed in sudying a Insead of experimening wih he iself we experimen wih a model of he Experimen wih he Acual Ways o sudy a Sysem Experimen

More information

IntentSearch:Capturing User Intention for One-Click Internet Image Search

IntentSearch:Capturing User Intention for One-Click Internet Image Search JOURNAL OF L A T E X CLASS FILES, VOL. 6, NO. 1, JANUARY 2010 1 InenSearch:Capuring User Inenion for One-Click Inerne Image Search Xiaoou Tang, Fellow, IEEE, Ke Liu, Jingyu Cui, Suden Member, IEEE, Fang

More information

Who thinks who knows who? Socio-Cognitive Analysis of an Network

Who thinks who knows who? Socio-Cognitive Analysis of an  Network Who hinks who knows who? Socio-Cogniive Analysis of an Email Nework Nishih Pahak Deparmen of Compuer Science Universiy of Minnesoa Minneapolis, MN, USA npahak@cs.umn.edu Sandeep Mane Deparmen of Compuer

More information

Announcements. TCP Congestion Control. Goals of Today s Lecture. State Diagrams. TCP State Diagram

Announcements. TCP Congestion Control. Goals of Today s Lecture. State Diagrams. TCP State Diagram nnouncemens TCP Congesion Conrol Projec #3 should be ou onigh Can do individual or in a eam of 2 people Firs phase due November 16 - no slip days Exercise good (beer) ime managemen EE 122: Inro o Communicaion

More information

Improved TLD Algorithm for Face Tracking

Improved TLD Algorithm for Face Tracking Absrac Improved TLD Algorihm for Face Tracking Huimin Li a, Chaojing Yu b and Jing Chen c Chongqing Universiy of Poss and Telecommunicaions, Chongqing 400065, China a li.huimin666@163.com, b 15023299065@163.com,

More information

Packet Scheduling in a Low-Latency Optical Interconnect with Electronic Buffers

Packet Scheduling in a Low-Latency Optical Interconnect with Electronic Buffers Packe cheduling in a Low-Laency Opical Inerconnec wih Elecronic Buffers Lin Liu Zhenghao Zhang Yuanyuan Yang Dep Elecrical & Compuer Engineering Compuer cience Deparmen Dep Elecrical & Compuer Engineering

More information

Protecting User Privacy in a Multi-Path Information-Centric Network Using Multiple Random-Caches

Protecting User Privacy in a Multi-Path Information-Centric Network Using Multiple Random-Caches Chu WB, Wang LF, Jiang ZJ e al. Proecing user privacy in a muli-pah informaion-cenric nework using muliple random-caches. JOURNAL OF COMPUTER SCIENCE AND TECHNOLOGY 32(3): 585 598 May 27. DOI.7/s39-7-73-2

More information

Reinforcement Learning by Policy Improvement. Making Use of Experiences of The Other Tasks. Hajime Kimura and Shigenobu Kobayashi

Reinforcement Learning by Policy Improvement. Making Use of Experiences of The Other Tasks. Hajime Kimura and Shigenobu Kobayashi Reinforcemen Learning by Policy Improvemen Making Use of Experiences of The Oher Tasks Hajime Kimura and Shigenobu Kobayashi Tokyo Insiue of Technology, JAPAN genfe.dis.iech.ac.jp, kobayasidis.iech.ac.jp

More information

SEINA: A Stealthy and Effective Internal Attack in Hadoop Systems

SEINA: A Stealthy and Effective Internal Attack in Hadoop Systems SEINA: A Sealhy and Effecive Inernal Aack in Hadoop Sysems Jiayin Wang, Teng Wang, Zhengyu Yang, Ying ao, Ningfang i, and Bo Sheng Deparmen of Compuer Science, Universiy of assachuses Boson, 1 orrissey

More information

Low-Cost WLAN based. Dr. Christian Hoene. Computer Science Department, University of Tübingen, Germany

Low-Cost WLAN based. Dr. Christian Hoene. Computer Science Department, University of Tübingen, Germany Low-Cos WLAN based Time-of-fligh fligh Trilaeraion Precision Indoor Personnel Locaion and Tracking for Emergency Responders Third Annual Technology Workshop, Augus 5, 2008 Worceser Polyechnic Insiue, Worceser,

More information

Rule-Based Multi-Query Optimization

Rule-Based Multi-Query Optimization Rule-Based Muli-Query Opimizaion Mingsheng Hong Dep. of Compuer cience Cornell Universiy mshong@cs.cornell.edu Johannes Gehrke Dep. of Compuer cience Cornell Universiy johannes@cs.cornell.edu Mirek Riedewald

More information

Who Thinks Who Knows Who? Socio-cognitive Analysis of Networks. Technical Report

Who Thinks Who Knows Who? Socio-cognitive Analysis of  Networks. Technical Report Who Thinks Who Knows Who? Socio-cogniive Analysis of Email Neworks Technical Repor Deparmen of Compuer Science and Engineering Universiy of Minnesoa 4-192 EECS Building 200 Union Sree SE Minneapolis, MN

More information

M(t)/M/1 Queueing System with Sinusoidal Arrival Rate

M(t)/M/1 Queueing System with Sinusoidal Arrival Rate 20 TUTA/IOE/PCU Journal of he Insiue of Engineering, 205, (): 20-27 TUTA/IOE/PCU Prined in Nepal M()/M/ Queueing Sysem wih Sinusoidal Arrival Rae A.P. Pan, R.P. Ghimire 2 Deparmen of Mahemaics, Tri-Chandra

More information

Time Expression Recognition Using a Constituent-based Tagging Scheme

Time Expression Recognition Using a Constituent-based Tagging Scheme Track: Web Conen Analysis, Semanics and Knowledge Time Expression Recogniion Using a Consiuen-based Tagging Scheme Xiaoshi Zhong and Erik Cambria School of Compuer Science and Engineering Nanyang Technological

More information

A Web Browsing Traffic Model for Simulation: Measurement and Analysis

A Web Browsing Traffic Model for Simulation: Measurement and Analysis A Web Browsing Traffic Model for Simulaion: Measuremen and Analysis Lourens O. Walers Daa Neworks Archiecure Group Universiy of Cape Town Privae Bag, Rondebosch, 7701 Tel: (021) 650 2663, Fax: (021) 689

More information

Chapter 3 MEDIA ACCESS CONTROL

Chapter 3 MEDIA ACCESS CONTROL Chaper 3 MEDIA ACCESS CONTROL Overview Moivaion SDMA, FDMA, TDMA Aloha Adapive Aloha Backoff proocols Reservaion schemes Polling Disribued Compuing Group Mobile Compuing Summer 2003 Disribued Compuing

More information

Delay in Packet Switched Networks

Delay in Packet Switched Networks 1 Delay in Packe Swiched Neworks Required reading: Kurose 1.5 and 1.6 CSE 4213, Fall 2006 Insrucor: N. Vlajic Delay in Packe-Swiched Neworks 2 Link/Nework Performance Measures: hroughpu and delay Link

More information

1. Function 1. Push-button interface 4g.plus. Push-button interface 4-gang plus. 2. Installation. Table of Contents

1. Function 1. Push-button interface 4g.plus. Push-button interface 4-gang plus. 2. Installation. Table of Contents Chaper 4: Binary inpus 4.6 Push-buon inerfaces Push-buon inerface Ar. no. 6708xx Push-buon inerface 2-gang plus Push-buon inerfacechaper 4:Binary inpusar. no.6708xxversion 08/054.6Push-buon inerfaces.

More information

EVALUATING ACCURACY OF A TIME ESTIMATOR IN A PROJECT

EVALUATING ACCURACY OF A TIME ESTIMATOR IN A PROJECT EVALUATING ACCURACY OF A TIME ESTIMATOR IN A PROJECT Thanh-Lam Nguyen, Graduae Insiue of Mechanical and Precision Engineering Wei-Ju Hung, Deparmen of Indusrial Engineering and Managemen Ming-Hung Shu,

More information

Nonparametric CUSUM Charts for Process Variability

Nonparametric CUSUM Charts for Process Variability Journal of Academia and Indusrial Research (JAIR) Volume 3, Issue June 4 53 REEARCH ARTICLE IN: 78-53 Nonparameric CUUM Chars for Process Variabiliy D.M. Zombade and V.B. Ghue * Dep. of aisics, Walchand

More information

Video Content Description Using Fuzzy Spatio-Temporal Relations

Video Content Description Using Fuzzy Spatio-Temporal Relations Proceedings of he 4s Hawaii Inernaional Conference on Sysem Sciences - 008 Video Conen Descripion Using Fuzzy Spaio-Temporal Relaions rchana M. Rajurkar *, R.C. Joshi and Sananu Chaudhary 3 Dep of Compuer

More information

Adaptive VM Management with Two Phase Power Consumption Cost Models in Cloud Datacenter

Adaptive VM Management with Two Phase Power Consumption Cost Models in Cloud Datacenter Mobile New Appl (2016) 21:793 805 DOI 10.1007/s11036-016-0690-z Adapive VM Managemen wih Two Phase Power Consumpion Cos Models in Cloud Daacener Dong-Ki Kang 1 & Fawaz Al-Hazemi 1 & Seong-Hwan Kim 1 &

More information

STRING DESCRIPTIONS OF DATA FOR DISPLAY*

STRING DESCRIPTIONS OF DATA FOR DISPLAY* SLAC-PUB-383 January 1968 STRING DESCRIPTIONS OF DATA FOR DISPLAY* J. E. George and W. F. Miller Compuer Science Deparmen and Sanford Linear Acceleraor Cener Sanford Universiy Sanford, California Absrac

More information

Video streaming over Vajda Tamás

Video streaming over Vajda Tamás Video sreaming over 802.11 Vajda Tamás Video No all bis are creaed equal Group of Picures (GoP) Video Sequence Slice Macroblock Picure (Frame) Inra (I) frames, Prediced (P) Frames or Bidirecional (B) Frames.

More information

Improving Explicit Congestion Notification with the Mark-Front Strategy

Improving Explicit Congestion Notification with the Mark-Front Strategy Improving Explici Congesion Noificaion wih he Mark-Fron Sraegy Chunlei Liu Raj Jain Deparmen of Compuer and Informaion Science Chief Technology Officer, Nayna Neworks, Inc. The Ohio Sae Universiy, Columbus,

More information

NRMI: Natural and Efficient Middleware

NRMI: Natural and Efficient Middleware NRMI: Naural and Efficien Middleware Eli Tilevich and Yannis Smaragdakis Cener for Experimenal Research in Compuer Sysems (CERCS), College of Compuing, Georgia Tech {ilevich, yannis}@cc.gaech.edu Absrac

More information

NEWTON S SECOND LAW OF MOTION

NEWTON S SECOND LAW OF MOTION Course and Secion Dae Names NEWTON S SECOND LAW OF MOTION The acceleraion of an objec is defined as he rae of change of elociy. If he elociy changes by an amoun in a ime, hen he aerage acceleraion during

More information

An HTTP Web Traffic Model Based on the Top One Million Visited Web Pages

An HTTP Web Traffic Model Based on the Top One Million Visited Web Pages An HTTP Web Traffic Model Based on he Top One Million Visied Web Pages Rasin Pries, Zsol Magyari, Phuoc Tran-Gia Universiy of Würzburg, Insiue of Compuer Science, Germany Email: {pries,rangia}@informaik.uni-wuerzburg.de

More information

MoBAN: A Configurable Mobility Model for Wireless Body Area Networks

MoBAN: A Configurable Mobility Model for Wireless Body Area Networks MoBAN: A Configurable Mobiliy Model for Wireless Body Area Neworks Majid Nabi 1, Marc Geilen 1, Twan Basen 1,2 1 Deparmen of Elecrical Engineering, Eindhoven Universiy of Technology, he Neherlands 2 Embedded

More information

Autonomic Cognitive-based Data Dissemination in Opportunistic Networks

Autonomic Cognitive-based Data Dissemination in Opportunistic Networks Auonomic Cogniive-based Daa Disseminaion in Opporunisic Neworks Lorenzo Valerio, Marco Coni, Elena Pagani and Andrea Passarella IIT-CNR, Pisa, Ialy Email: {marco.coni,andrea.passarella,lorenzo.valerio}@ii.cnr.i

More information

In fmri a Dual Echo Time EPI Pulse Sequence Can Induce Sources of Error in Dynamic Magnetic Field Maps

In fmri a Dual Echo Time EPI Pulse Sequence Can Induce Sources of Error in Dynamic Magnetic Field Maps In fmri a Dual Echo Time EPI Pulse Sequence Can Induce Sources of Error in Dynamic Magneic Field Maps A. D. Hahn 1, A. S. Nencka 1 and D. B. Rowe 2,1 1 Medical College of Wisconsin, Milwaukee, WI, Unied

More information

CAMERA CALIBRATION BY REGISTRATION STEREO RECONSTRUCTION TO 3D MODEL

CAMERA CALIBRATION BY REGISTRATION STEREO RECONSTRUCTION TO 3D MODEL CAMERA CALIBRATION BY REGISTRATION STEREO RECONSTRUCTION TO 3D MODEL Klečka Jan Docoral Degree Programme (1), FEEC BUT E-mail: xkleck01@sud.feec.vubr.cz Supervised by: Horák Karel E-mail: horak@feec.vubr.cz

More information

Performance and Availability Assessment for the Configuration of Distributed Workflow Management Systems

Performance and Availability Assessment for the Configuration of Distributed Workflow Management Systems Absrac Performance and Availabiliy Assessmen for he Configuraion of Disribued Workflow Managemen Sysems Michael Gillmann 1, Jeanine Weissenfels 1, Gerhard Weikum 1, Achim Kraiss 2 1 Universiy of he Saarland,

More information