Just-In-Time Software Pipelining
|
|
- Randolph Warren
- 6 years ago
- Views:
Transcription
1 Just-In-Time Software Pipelining Hongbo Rong Hyunchul Park Youfeng Wu Cheng Wang Programming Systems Lab Intel Labs, Santa Clara
2 What is software pipelining? A loop optimization exposing instruction-level parallelism (ILP) for (i = 0; i<n; i++){ } a: x = y + 1 b: y = A[i] + x c: B[i+2] = B[i]*x : A[i+2] = B[i+2] 2
3 What is software pipelining? A loop optimization exposing instruction-level parallelism (ILP) for (i = 0; i<n; i++){ a <1> a: x = y + 1 b: y = A[i] + x <1> c b c: B[i+2] = B[i]*x : A[i+2] = B[i+2] <2> } Local epenence Loop-carrie epenence 2
4 <1> c a <1> b <2> 3
5 Iteration <1> c a <1> b <2> 3
6 a b c Iteration a <1> <1> c b <2> 3
7 a Iteration b c a b c a <1> <1> c b <2> 3
8 a b c a b c Iteration Initiation interval(ii)=2 a <1> <1> c b <2> 3
9 a b c a b c Iteration Initiation interval(ii)=2 a <1> a b c <1> c b <2> 3
10 a b c a b c Iteration Initiation interval(ii)=2 a <1> a b c a <1> c b <2> b c 3
11 a b c a b c Iteration Initiation interval(ii)=2 a <1> a b c a <1> c b <2> b c 3
12 a b c a b c a b c a b c Iteration Initiation interval(ii)=2 Kernel 2 stages Different iterations work on ifferent stages in parallel 3
13 a b c a b c a b c a b c Iteration Initiation interval(ii)=2 Kernel 2 stages Different iterations work on ifferent stages in parallel 3
14 a b c a b c a b c a b c Iteration Initiation interval(ii)=2 Kernel 2 stages Different iterations work on ifferent stages in parallel 3
15 a b c a b c a b c a b c Iteration Initiation interval(ii)=2 Kernel 2 stages Different iterations work on ifferent stages in parallel II is the performance inicator 3
16 Software pipelining has been static Extensively stuie in 3 ecaes, an efficient for wie-issue architectures VLIW [Lam 1988] superscalar [Ruttenberg et al. 1996] 4
17 Software pipelining has been static Extensively stuie in 3 ecaes, an efficient for wie-issue architectures VLIW [Lam 1988] superscalar [Ruttenberg et al. 1996] It is seen only in static compilers 4
18 Software pipelining has been static Extensively stuie in 3 ecaes, an efficient for wie-issue architectures VLIW [Lam 1988] superscalar [Ruttenberg et al. 1996] It is seen only in static compilers Most works aim to minimize II but not compile overhea 4
19 It is time now to exten software pipelining to ynamic compilers! 5
20 It is time now to exten software pipelining to ynamic compilers! Dynamic languages are increasingly popular JavaScript an PhP 88.9% an 81.5% in client an server websites (W3Techs) 5
21 It is time now to exten software pipelining to ynamic compilers! Dynamic languages are increasingly popular JavaScript an PhP 88.9% an 81.5% in client an server websites (W3Techs) Huge amount of legacy coe Small optimization scope: a loop iteration Software pipelining enlarges the scope to many iterations 5
22 It is time now to exten software pipelining to ynamic compilers! Dynamic languages are increasingly popular JavaScript an PhP 88.9% an 81.5% in client an server websites (W3Techs) Huge amount of legacy coe Small optimization scope: a loop iteration Software pipelining enlarges the scope to many iterations Minimizing compile overhea must be the 1 st objective Only simple/fast algorithms can be use 5
23 It is time now to exten software pipelining to ynamic compilers! Dynamic languages are increasingly popular JavaScript an PhP 88.9% an 81.5% in client an server websites (W3Techs) Huge amount of legacy coe Small optimization scope: a loop iteration Software pipelining enlarges the scope to many iterations Minimizing compile overhea must be the 1 st objective Only simple/fast algorithms can be use linear-time algorithms are preferre 5
24 Challenges Memory aliases kill parallelism Harware: Atomic region + rotating alias registers [MICRO-46] 6
25 Challenges Memory aliases kill parallelism Harware: Atomic region + rotating alias registers [MICRO-46] for (i = 0; i<n; i++){ a b c } 6
26 Challenges Memory aliases kill parallelism Harware: Atomic region + rotating alias registers [MICRO-46] for (i = 0; i<n; i++){ Original a optimizat b ion scope c } 6
27 Challenges Memory aliases kill parallelism Harware: Atomic region + rotating alias registers [MICRO-46] for (i = 0; i<n; i++){ Original a optimizat b ion scope c } 6 for (j = 0; j<n; j+=m){ for (i = j; i<j+m; i++){ a b c } }
28 Challenges Memory aliases kill parallelism Harware: Atomic region + rotating alias registers [MICRO-46] for (i = 0; i<n; i++){ Original a optimizat b ion scope c } Atomic region 6 for (j = 0; j<n; j+=m){ for (i = j; i<j+m; i++){ a } } b c
29 Challenges Memory aliases kill parallelism Harware: Atomic region + rotating alias registers [MICRO-46] Costly rollback Software: Light-weight checkpointing for (i = 0; i<n; i++){ Original a optimizat b ion scope c } Atomic region 6 for (j = 0; j<n; j+=m){ for (i = j; i<j+m; i++){ a } } b c
30 Challenges Memory aliases kill parallelism Harware: Atomic region + rotating alias registers [MICRO-46] Costly rollback Software: Light-weight checkpointing Scheuling is expensive for (i = 0; i<n; i++){ Original a optimizat b ion scope c } Atomic region 6 for (j = 0; j<n; j+=m){ for (i = j; i<j+m; i++){ a } } b c
31 7 Framework (on Transmeta CMS)
32 x86 binary Framework (on Transmeta CMS) 7
33 x86 binary Framework (on Transmeta CMS) Interpreter & profiler 7
34 x86 binary Interpreter & profiler Framework (on Transmeta CMS) Hot region optimizations 7
35 x86 binary Interpreter & profiler Framework (on Transmeta CMS) Hot region optimizations Acyclic scheuler 7
36 x86 binary Interpreter & profiler Framework (on Transmeta CMS) Hot region optimizations Acyclic scheuler Assembler 7
37 x86 binary Interpreter & profiler Framework (on Transmeta CMS) Hot region optimizations Acyclic scheuler Coe cache Assembler 7
38 x86 binary Interpreter & profiler Framework (on Transmeta CMS) Hot region optimizations Acyclic scheuler Coe cache Assembler 8
39 Framework (on Transmeta CMS) Hot region optimizations Acyclic scheuler Assembler 8
40 Framework (on Transmeta CMS) Hot region optimizations Loop selection Acyclic scheuler Acyclic scheuler Assembler 8
41 Framework (on Transmeta CMS) Hot region optimizations Initialization Loop selection Acyclic scheuler Acyclic scheuler Assembler 8
42 Framework (on Transmeta CMS) Hot region optimizations Initialization Loop selection Scheuling Acyclic scheuler Acyclic scheuler Assembler 8
43 Framework (on Transmeta CMS) Hot region optimizations Initialization Loop selection Acyclic scheuler Acyclic scheuler Scheuling Rotating alias reg alloc Assembler 8
44 Framework (on Transmeta CMS) Hot region optimizations Initialization Loop selection Acyclic scheuler Acyclic scheuler Scheuling Rotating alias reg alloc Assembler Coe generation 8
45 Framework (on Transmeta CMS) Hot region optimizations Initialization Loop selection Acyclic scheuler Acyclic scheuler Scheuling Rotating alias reg alloc Assembler Coe generation 8
46 Framework (on Transmeta CMS) Hot region optimizations Initialization Loop selection Acyclic scheuler Acyclic scheuler Scheuling Rotating alias reg alloc Assembler Atomic region Coe generation Rotating alias register file 8
47 Scheuling is expensive NP-complete problem to fin an optimal scheule [Collan et al. 1996] 9
48 Scheuling is expensive NP-complete problem to fin an optimal scheule [Collan et al. 1996] O(V 3 ) at least, exponential at worst [Rau et al. 1992] V: number of operations 9
49 Scheuling is expensive NP-complete problem to fin an optimal scheule [Collan et al. 1996] O(V 3 ) at least, exponential at worst [Rau et al. 1992] V: number of operations Can we linearize software pipelining? 9
50 Keep scheuling time uner control 10
51 Keep scheuling time uner control Scheule in linear time Use either simple or fast sub-algorithms Avoi cubic or exponential complexity 10
52 Keep scheuling time uner control Scheule in linear time Use either simple or fast sub-algorithms Avoi cubic or exponential complexity For iterative sub-algorithms, have a threshol: the maximum #iterations allowe 10
53 Keep scheuling time uner control Scheule in linear time Use either simple or fast sub-algorithms Avoi cubic or exponential complexity For iterative sub-algorithms, have a threshol: the maximum #iterations allowe The smaller the threshol, the less the compile overhea Once exceee, abort software pipelining 10
54 Keep scheuling time uner control Scheule in linear time Use either simple or fast sub-algorithms Avoi cubic or exponential complexity For iterative sub-algorithms, have a threshol: the maximum #iterations allowe The smaller the threshol, the less the compile overhea Once exceee, abort software pipelining Key question: How small can the threshol be? 10
55 Keep scheuling time uner control Scheule in linear time Use either simple or fast sub-algorithms Avoi cubic or exponential complexity For iterative sub-algorithms, have a threshol: the maximum #iterations allowe The smaller the threshol, the less the compile overhea Once exceee, abort software pipelining Key question: How small can the threshol be? Fin a goo enough scheule No backtracking Priority function: approximate an never upate Separate epenence an resource constraints Separate local an loop-carrie epenences 10
56 Keep scheuling time uner control Scheule in linear time Use either simple or fast sub-algorithms Avoi cubic or exponential complexity For iterative sub-algorithms, have a threshol: the maximum #iterations allowe The smaller the threshol, the less the compile overhea Once exceee, abort software pipelining Key question: How small can the threshol be? Fin a goo enough scheule No backtracking Priority function: approximate an never upate Separate epenence an resource constraints Separate local an loop-carrie epenences Iteratively improve a scheule 10
57 Just-In-Time Software Pipelining 11
58 Just-In-Time Software Pipelining Prepartition H, B Quickly creates an initial scheule to start with 11
59 Just-In-Time Software Pipelining Prepartition H, B Local scheuling Quickly creates an initial scheule to start with Hanles local epenences an resources 11
60 Just-In-Time Software Pipelining Prepartition H, B Local scheuling Kernel expansion Quickly creates an initial scheule to start with Hanles local epenences an resources Ajusts the scheule for loop-carrie epenences 11
61 Just-In-Time Software Pipelining Prepartition Local scheuling Kernel expansion Exit H, B Quickly creates an initial scheule to start with Hanles local epenences an resources Ajusts the scheule for loop-carrie epenences 11
62 Just-In-Time Software Pipelining Prepartition Local scheuling Kernel expansion Exit H, B Rotation 3 Quickly creates an initial scheule to start with Hanles local epenences an resources Ajusts the scheule for loop-carrie epenences Iteratively improves the scheule 11
63 Just-In-Time Software Pipelining Prepartition Local scheuling Kernel expansion Exit H, B Rotation 3 Time complexity: O(V+E) V: #operations E: #epenences 11 Quickly creates an initial scheule to start with Hanles local epenences an resources Ajusts the scheule for loop-carrie epenences Iteratively improves the scheule
64 Prepartition Iteration Local scheuling Kernel expansion Exit Rotation Local epenences Resources Loop-carrie epenences Illustration 12
65 Prepartition Local scheuling Kernel expansion abc RecMII Iteration Exit Rotation Local epenences Resources Loop-carrie epenences Illustration 12
66 Prepartition Local scheuling abc RecMII Iteration Kernel expansion abc Kernel Exit Rotation abc Local epenences Resources Loop-carrie epenences abc Illustration 12
67 Calculating RecMII Recurrence Minimum II II etermine by the biggest epenence cycles Neee by almost every software pipelining metho 13
68 Conventional Calculating RecMII Recurrence Minimum II II etermine by the biggest epenence cycles Neee by almost every software pipelining metho O(V 3 ) 13
69 Calculating RecMII Recurrence Minimum II II etermine by the biggest epenence cycles Neee by almost every software pipelining metho Conventional O(V 3 ) Turn the problem into a Markov ecision process 1 st time Howar algorithm is applie to pipelining 13
70 Calculating RecMII Recurrence Minimum II II etermine by the biggest epenence cycles Neee by almost every software pipelining metho Conventional O(V 3 ) Howar policy iteration algo. O(exponential*E) Turn the problem into a Markov ecision process 1 st time Howar algorithm is applie to pipelining 13
71 Calculating RecMII Recurrence Minimum II II etermine by the biggest epenence cycles Neee by almost every software pipelining metho Conventional Howar policy iteration algo. Linearize Howar O(V 3 ) O(exponential*E) O( H* E) Turn the problem into a Markov ecision process 1 st time Howar algorithm is applie to pipelining Linearize Howar with a small constant H 13
72 Calculating RecMII Recurrence Minimum II II etermine by the biggest epenence cycles Neee by almost every software pipelining metho Conventional O(V 3 ) Howar policy iteration algo. O(exponential*E) Linearize Howar O( E) Turn the problem into a Markov ecision process 1 st time Howar algorithm is applie to pipelining Linearize Howar with a small constant H 13
73 Calculating RecMII Recurrence Minimum II II etermine by the biggest epenence cycles Neee by almost every software pipelining metho Conventional O(V 3 ) Howar policy iteration algo. O(exponential*E) Linearize Howar O( E) Turn the problem into a Markov ecision process 1 st time Howar algorithm is applie to pipelining Linearize Howar with a small constant H A by-prouct: critical 13 operations
74 Divie operations into stages Bellman-For O(V*E) 14
75 Divie operations into stages Bellman-For O(V*E) Also an iterative sub-algorithm 14
76 Divie operations into stages Bellman-For Linearize Bellman-For O(V*E) O( E) B* Also an iterative sub-algorithm Linearize it with a constant B 14
77 Divie operations into stages Bellman-For O(V*E) Linearize Bellman-For O( E) B* Also an iterative sub-algorithm Linearize it with a constant B Specific orer in visiting eges Scan noes in sequential orer, an visit their incoming eges Values are propagate along local eges in the 1 st itr. Values are propagate along loop-carrie eges in the 2 n itr. 14
78 Divie operations into stages Bellman-For O(V*E) Linearize Bellman-For O( E) Also an iterative sub-algorithm Linearize it with a constant B Specific orer in visiting eges Scan noes in sequential orer, an visit their incoming eges Values are propagate along local eges in the 1 st itr. Values are propagate along loop-carrie eges in the 2 n itr. 14
79 Prepartition Local scheuling abc RecMII Iteration Kernel expansion abc Kernel Exit Rotation abc Local epenences Resources Loop-carrie epenences abc Illustration 15
80 Prepartition Local scheuling abc RecMII Iteration Kernel expansion abc Kernel Exit Rotation abc Local epenences Resources Loop-carrie epenences abc Illustration 15
81 Prepartition Local scheuling abc II Iteration Kernel expansion abc Kernel Exit Rotation Local epenences Resources Loop-carrie epenences Illustration (Cont.) 16 abc abc
82 X X Prepartition Local scheuling Kernel expansion Exit Rotation Local epenences Resources a Illustration (Cont.) b c a Loop-carrie epenences 16 b c II a b c Kernel a b c Iteration
83 Local scheuling Any local scheuling algorithm can be use E.g. list scheuling 17
84 Local scheuling Any local scheuling algorithm can be use E.g. list scheuling Weakness: Loop-carrie epenences may be violate 17
85 Local scheuling Any local scheuling algorithm can be use E.g. list scheuling Weakness: Loop-carrie epenences may be violate To reuce the chance of violation: Before scheuling, priority function consiers loopcarrie epenences in avance Prioritize critical operations 17
86 Prepartition Local scheuling Kernel expansion Exit Rotation Iteration a b c a b c II a Kernel X X Local epenences Resources Loop-carrie epenences Illustration (Cont.) 18 b c a b c
87 Prepartition Local scheuling Kernel expansion Exit Rotation Iteration a b c a b c II a Kernel X X Local epenences Resources Loop-carrie epenences Illustration (Cont.) 18 b c a b c
88 x x Prepartition Local scheuling Kernel expansion Exit Rotation Local epenences Resources a b c Illustration (Cont.) 19 Loop-carrie epenences a b c II a b c a b c Iteration
89 Prepartition Local scheuling Kernel expansion x x x Exit Rotation Local epenences Resources Iteration a b c Loop-carrie epenences Illustration (Cont.) 19 II a b c Kernel a b c a b c
90 Prepartition Local scheuling Kernel expansion x x x Exit Rotation Local epenences Resources Loop-carrie epenences Illustration (Cont.) Iteration a b c a b c a b c 20 a b c
91 Prepartition Local scheuling Kernel expansion Exit Rotation Local epenences Resources a b c Loop-carrie epenences Illustration (Cont.) 20 Iteration a b c a Kernel b c a b c
92 Experiments Transmeta CMS on SPEC2k traces 21
93 Experiments Transmeta CMS on SPEC2k traces Functional simulator to comprehensively Explore threshols H an B Evaluate compile overhea an scheules quality 21
94 Experiments Transmeta CMS on SPEC2k traces Functional simulator to comprehensively Explore threshols H an B Evaluate compile overhea an scheules quality Cycle-accurate simulator Simulates cache misses, latencies, Initial performance stuy 21
95 Linearize Howar vs. exponential backoff + binary search [Rau et al. 1992] 22
96 1000 Linearize Howar vs. exponential backoff + binary search [Rau et al. 1992] #epenences
97 1000 Linearize Howar vs. exponential backoff + binary search [Rau et al. 1992] #epenences
98 Linearize Howar vs. exponential backoff + binary 1000 search [Rau et al. 1992] Average 9X faster Peak 812X faster H 3 3 for 96% of 11,992 loops #epenences
99 1000 Linearize Howar vs. exponential backoff + binary search [Rau et al. 1992] Average 9X faster Peak 812X faster H 3 3 for 96% of 11,992 loops H 14 for all loops 22 #epenences
100 Linearize Bellman-For 23
101 Linearize Bellman-For 100% % Total loops 50% 0% B 23
102 Linearize Bellman-For 100% % Total loops 50% 0% B B 3 for 98.8% of 11,992 loops 23
103 Linearize Bellman-For 100% % Total loops 50% 0% B B 3 for 98.8% of 11,992 loops 23
104 Linearize Bellman-For 100% % Total loops 50% 0% B B 3 for 98.8% of 11,992 loops B 5 for all the loops 23
105 Linearize Bellman-For 100% % Total loops 50% 0% B B 3 for 98.8% of 11,992 loops B 5 for all the loops From now on, we set H=10, B=5 11,910 loops scheule 23
106 Scheuling overhea & scheules quality Overhea % loops with optimal scheules 95% 1X RS2 DESP JITSP RS2 DESP JITSP 24
107 Scheuling overhea & scheules quality Overhea % loops with optimal scheules 95% 1X RS2 DESP JITSP RS2 DESP JITSP JITSP achieves optimal scheules for 95% loops 24
108 Scheuling overhea & scheules quality Overhea % loops with optimal scheules 12X 95% 1X RS2 DESP JITSP RS2 DESP JITSP JITSP achieves optimal scheules for 95% loops 24
109 Scheuling overhea & scheules quality Overhea % loops with optimal scheules 12X 13% 95% 1X RS2 DESP JITSP RS2 DESP JITSP JITSP achieves optimal scheules for 95% loops 24
110 Scheuling overhea & scheules quality Overhea % loops with optimal scheules 12X 13% 95% 25% 1X RS2 DESP JITSP RS2 DESP JITSP JITSP achieves optimal scheules for 95% loops 24
111 Scheuling overhea & scheules quality Overhea % loops with optimal scheules 12X 13% 23% 95% 25% 1X RS2 DESP JITSP RS2 DESP JITSP JITSP achieves optimal scheules for 95% loops 24
112 Compile overhea istribution Acyclic scheuler, 27% Software pipelining, 33% Assembler, 6% Hot region optimizations, 34% Note: acyclic scheuler hanles acyclic coe, or loops NOT selecte for software pipelining 25
113 Preliminary performance 40 hot loops 26
114 Preliminary performance too many unrolls 1% Successfully generate coe 25% too many registers 55% filtere 19% 40 hot loops 26
115 Preliminary performance too many unrolls 1% Successfully generate coe 25% too many registers 55% filtere 19% 40 hot loops 26
116 Preliminary performance too many unrolls 1% Successfully generate coe 25% too many registers 55% filtere 19% 40 hot loops The architecture has bottleneck in registers 2/7/24/32 preicate/static alias/integer/floating point available for pipelining 26
117 Preliminary performance too many unrolls 1% Successfully generate coe 25% too many registers 55% filtere 19% 40 hot loops The architecture has bottleneck in registers 2/7/24/32 preicate/static alias/integer/floating point available for pipelining 26
118 Preliminary performance too many unrolls 1% Successfully generate coe 25% too many registers 55% filtere 19% 40 hot loops The architecture has bottleneck in registers 2/7/24/32 preicate/static alias/integer/floating point available for pipelining 26
119 Preliminary performance (Cont.) 2.00 II/MII (The lower, the better) 1.50 Optimal RS2 DESP JITSP 27
120 Preliminary performance (Cont.) 2.00 II/MII (The lower, the better) 1.50 Optimal RS2 DESP JITSP JITSP achieves optimal scheules for all but 1 loops 27
121 Preliminary performance (Cont.) 40% 20% 0% -20% Speeup RS2 DESP JITSP -40% JITSP 10~36% speeup. Better than the others 28
122 Preliminary performance (Cont.) 40% 20% 0% -20% Speeup RS2 DESP JITSP -40% JITSP 10~36% speeup. Better than the others Exception: loop 2. Optimal scheule but slowown ue to memory aliases retranslation neee 28
123 Preliminary performance (Cont.) 40% Speeup 20% 0% -20% RS2 DESP JITSP -40% JITSP 10~36% speeup. Better than the others Exception: loop 2. Optimal scheule but slowown ue to memory aliases retranslation neee Speeup swim(5.3%), ammp(4.4%), 28 mcf(3.6%)
124 Conclusion 29
125 Conclusion 1st linear software pipelining algorithm implemente for ynamic compilers 29
126 Conclusion 1st linear software pipelining algorithm implemente for ynamic compilers Turns a traitionally-expensive optimization into linear time O(V+E) 29
127 Conclusion 1st linear software pipelining algorithm implemente for ynamic compilers Turns a traitionally-expensive optimization into linear time O(V+E) Taking avantages of results from various omains: harware circuit esign (Retiming) stochastic control (Howar algorithm) Graph (Bellman-For) software pipelining (Rotation scheuling an DESP) 29
128 Conclusion 1st linear software pipelining algorithm implemente for ynamic compilers Turns a traitionally-expensive optimization into linear time O(V+E) Taking avantages of results from various omains: harware circuit esign (Retiming) stochastic control (Howar algorithm) Graph (Bellman-For) software pipelining (Rotation scheuling an DESP) Generates optimal or near-optimal scheules with reasonable compile overhea 29
129 Future work Register availability A more architecture registers Algorithm: Register pressure-aware Implementation Loop selection Re-translation Evaluation Benchmarks 30
130 Backup slies 31
131 Howar algorithm Each operation is reware to reach a cycle via 1 policy ege The bigger the cycle, the more the rewar 32
132 Howar algorithm Each operation is reware to reach a cycle via 1 policy ege The bigger the cycle, the more the rewar a c b 32
133 Howar algorithm Each operation is reware to reach a cycle via 1 policy ege The bigger the cycle, the more the rewar a c b 32
134 Howar algorithm Each operation is reware to reach a cycle via 1 policy ege The bigger the cycle, the more the rewar a c b 32
135 Distribution statistics of JITSP 33
Chapter 9 Memory Management
Contents 1. Introuction 2. Computer-System Structures 3. Operating-System Structures 4. Processes 5. Threas 6. CPU Scheuling 7. Process Synchronization 8. Dealocks 9. Memory Management 10.Virtual Memory
More informationCompiler Optimisation
Compiler Optimisation Michael O Boyle mob@inf.e.ac.uk Room 1.06 January, 2014 1 Two recommene books for the course Recommene texts Engineering a Compiler Engineering a Compiler by K. D. Cooper an L. Torczon.
More informationLoop Scheduling and Partitions for Hiding Memory Latencies
Loop Scheuling an Partitions for Hiing Memory Latencies Fei Chen Ewin Hsing-Mean Sha Dept. of Computer Science an Engineering University of Notre Dame Notre Dame, IN 46556 Email: fchen,esha @cse.n.eu Tel:
More informationCMSC 430 Introduction to Compilers. Spring Register Allocation
CMSC 430 Introuction to Compilers Spring 2016 Register Allocation Introuction Change coe that uses an unoune set of virtual registers to coe that uses a finite set of actual regs For ytecoe targets, can
More informationOverview. Operating Systems I. Simple Memory Management. Simple Memory Management. Multiprocessing w/fixed Partitions.
Overview Operating Systems I Management Provie Services processes files Manage Devices processor memory isk Simple Management One process in memory, using it all each program nees I/O rivers until 96 I/O
More informationYet Another Parallel Hypothesis Search for Inverse Entailment Hiroyuki Nishiyama and Hayato Ohwada Faculty of Sci. and Tech. Tokyo University of Scien
Yet Another Parallel Hypothesis Search for Inverse Entailment Hiroyuki Nishiyama an Hayato Ohwaa Faculty of Sci. an Tech. Tokyo University of Science, 2641 Yamazaki, Noa-shi, CHIBA, 278-8510, Japan hiroyuki@rs.noa.tus.ac.jp,
More informationPower-Performance Trade-offs for Energy-Efficient Architectures: A Quantitative Study
In the Proc. of the Intl. Conf. on Computer Design (ICCD-02), Frieburg, March 2002 Power-Performance Trae-offs for Energy-Efficient Architectures: A Quantitative Stuy Hongbo Yang R. Govinarajan Guang R.
More informationRandom Clustering for Multiple Sampling Units to Speed Up Run-time Sample Generation
DEIM Forum 2018 I4-4 Abstract Ranom Clustering for Multiple Sampling Units to Spee Up Run-time Sample Generation uzuru OKAJIMA an Koichi MARUAMA NEC Solution Innovators, Lt. 1-18-7 Shinkiba, Koto-ku, Tokyo,
More informationAlmost Disjunct Codes in Large Scale Multihop Wireless Network Media Access Control
Almost Disjunct Coes in Large Scale Multihop Wireless Network Meia Access Control D. Charles Engelhart Anan Sivasubramaniam Penn. State University University Park PA 682 engelhar,anan @cse.psu.eu Abstract
More informationAn Adaptive Routing Algorithm for Communication Networks using Back Pressure Technique
International OPEN ACCESS Journal Of Moern Engineering Research (IJMER) An Aaptive Routing Algorithm for Communication Networks using Back Pressure Technique Khasimpeera Mohamme 1, K. Kalpana 2 1 M. Tech
More informationAlgorithm for Intermodal Optimal Multidestination Tour with Dynamic Travel Times
Algorithm for Intermoal Optimal Multiestination Tour with Dynamic Travel Times Neema Nassir, Alireza Khani, Mark Hickman, an Hyunsoo Noh This paper presents an efficient algorithm that fins the intermoal
More informationFrequency Domain Parameter Estimation of a Synchronous Generator Using Bi-objective Genetic Algorithms
Proceeings of the 7th WSEAS International Conference on Simulation, Moelling an Optimization, Beijing, China, September 15-17, 2007 429 Frequenc Domain Parameter Estimation of a Snchronous Generator Using
More informationAnyTraffic Labeled Routing
AnyTraffic Labele Routing Dimitri Papaimitriou 1, Pero Peroso 2, Davie Careglio 2 1 Alcatel-Lucent Bell, Antwerp, Belgium Email: imitri.papaimitriou@alcatel-lucent.com 2 Universitat Politècnica e Catalunya,
More informationBaring it all to Software: The Raw Machine
Baring it all to Software: The Raw Machine Elliot Waingol, Michael Taylor, Vivek Sarkar, Walter Lee, Victor Lee, Jang Kim, Matthew Frank, Peter Finch, Srikrishna Devabhaktuni, Rajeev Barua, Jonathan Babb,
More informationAdaptive Load Balancing based on IP Fast Reroute to Avoid Congestion Hot-spots
Aaptive Loa Balancing base on IP Fast Reroute to Avoi Congestion Hot-spots Masaki Hara an Takuya Yoshihiro Faculty of Systems Engineering, Wakayama University 930 Sakaeani, Wakayama, 640-8510, Japan Email:
More informationChapter 3 (Cont III): Exploiting ILP with Software Approaches. Copyright Josep Torrellas 1999, 2001, 2002,
Chapter 3 (Cont III): Exploiting ILP with Software Approaches Copyright Josep Torrellas 1999, 2001, 2002, 2013 1 Exposing ILP (3.2) Want to find sequences of unrelated instructions that can be overlapped
More informationDivide-and-Conquer Algorithms
Supplment to A Practical Guie to Data Structures an Algorithms Using Java Divie-an-Conquer Algorithms Sally A Golman an Kenneth J Golman Hanout Divie-an-conquer algorithms use the following three phases:
More informationParallel Directionally Split Solver Based on Reformulation of Pipelined Thomas Algorithm
NASA/CR-1998-208733 ICASE Report No. 98-45 Parallel Directionally Split Solver Base on Reformulation of Pipeline Thomas Algorithm A. Povitsky ICASE, Hampton, Virginia Institute for Computer Applications
More informationA Neural Network Model Based on Graph Matching and Annealing :Application to Hand-Written Digits Recognition
ITERATIOAL JOURAL OF MATHEMATICS AD COMPUTERS I SIMULATIO A eural etwork Moel Base on Graph Matching an Annealing :Application to Han-Written Digits Recognition Kyunghee Lee Abstract We present a neural
More informationOverview : Computer Networking. IEEE MAC Protocol: CSMA/CA Internet mobility TCP over noisy links
Overview 15-441 15-441: Computer Networking 15-641 Lecture 24: Wireless Eric Anerson Fall 2014 www.cs.cmu.eu/~prs/15-441-f14 Internet mobility TCP over noisy links Link layer challenges an WiFi Cellular
More informationPairwise alignment using shortest path algorithms, Gunnar Klau, November 29, 2005, 11:
airwise alignment using shortest path algorithms, Gunnar Klau, November 9,, : 3 3 airwise alignment using shortest path algorithms e will iscuss: it graph Dijkstra s algorithm algorithm (GDU) 3. References
More informationEFFICIENT ON-LINE TESTING METHOD FOR A FLOATING-POINT ADDER
FFICINT ON-LIN TSTING MTHOD FOR A FLOATING-POINT ADDR A. Droz, M. Lobachev Department of Computer Systems, Oessa State Polytechnic University, Oessa, Ukraine Droz@ukr.net, Lobachev@ukr.net Abstract In
More informationParticle Swarm Optimization Based on Smoothing Approach for Solving a Class of Bi-Level Multiobjective Programming Problem
BULGARIAN ACADEMY OF SCIENCES CYBERNETICS AND INFORMATION TECHNOLOGIES Volume 17, No 3 Sofia 017 Print ISSN: 1311-970; Online ISSN: 1314-4081 DOI: 10.1515/cait-017-0030 Particle Swarm Optimization Base
More informationProvisioning Virtualized Cloud Services in IP/MPLS-over-EON Networks
Provisioning Virtualize Clou Services in IP/MPLS-over-EON Networks Pan Yi an Byrav Ramamurthy Department of Computer Science an Engineering, University of Nebraska-Lincoln Lincoln, Nebraska 68588 USA Email:
More information6.823 Computer System Architecture. Problem Set #3 Spring 2002
6.823 Computer System Architecture Problem Set #3 Spring 2002 Stuents are strongly encourage to collaborate in groups of up to three people. A group shoul han in only one copy of the solution to the problem
More informationLearning Subproblem Complexities in Distributed Branch and Bound
Learning Subproblem Complexities in Distribute Branch an Boun Lars Otten Department of Computer Science University of California, Irvine lotten@ics.uci.eu Rina Dechter Department of Computer Science University
More informationImpact of FTP Application file size and TCP Variants on MANET Protocols Performance
International Journal of Moern Communication Technologies & Research (IJMCTR) Impact of FTP Application file size an TCP Variants on MANET Protocols Performance Abelmuti Ahme Abbasher Ali, Dr.Amin Babkir
More informationImproving Spatial Reuse of IEEE Based Ad Hoc Networks
mproving Spatial Reuse of EEE 82.11 Base A Hoc Networks Fengji Ye, Su Yi an Biplab Sikar ECSE Department, Rensselaer Polytechnic nstitute Troy, NY 1218 Abstract n this paper, we evaluate an suggest methos
More informationNOW Handout Page 1. Review from Last Time #1. CSE 820 Graduate Computer Architecture. Lec 8 Instruction Level Parallelism. Outline
CSE 820 Graduate Computer Architecture Lec 8 Instruction Level Parallelism Based on slides by David Patterson Review Last Time #1 Leverage Implicit Parallelism for Performance: Instruction Level Parallelism
More informationCSE 820 Graduate Computer Architecture. week 6 Instruction Level Parallelism. Review from Last Time #1
CSE 820 Graduate Computer Architecture week 6 Instruction Level Parallelism Based on slides by David Patterson Review from Last Time #1 Leverage Implicit Parallelism for Performance: Instruction Level
More informationStudy of Network Optimization Method Based on ACL
Available online at www.scienceirect.com Proceia Engineering 5 (20) 3959 3963 Avance in Control Engineering an Information Science Stuy of Network Optimization Metho Base on ACL Liu Zhian * Department
More informationTECH. 9. Code Scheduling for ILP-Processors. Levels of static scheduling. -Eligible Instructions are
9. Code Scheduling for ILP-Processors Typical layout of compiler: traditional, optimizing, pre-pass parallel, post-pass parallel {Software! compilers optimizing code for ILP-processors, including VLIW}
More informationACE: And/Or-parallel Copying-based Execution of Logic Programs
ACE: An/Or-parallel Copying-base Execution of Logic Programs Gopal GuptaJ Manuel Hermenegilo* Enrico PontelliJ an Vítor Santos Costa' Abstract In this paper we present a novel execution moel for parallel
More informationComparison of Methods for Increasing the Performance of a DUA Computation
Comparison of Methos for Increasing the Performance of a DUA Computation Michael Behrisch, Daniel Krajzewicz, Peter Wagner an Yun-Pang Wang Institute of Transportation Systems, German Aerospace Center,
More informationUG4 Honours project selection: Talk to Vijay or Boris if interested in computer architecture projects
Announcements UG4 Honours project selection: Talk to Vijay or Boris if interested in computer architecture projects Inf3 Computer Architecture - 2017-2018 1 Last time: Tomasulo s Algorithm Inf3 Computer
More informationComputer Organization
Computer Organization Douglas Comer Computer Science Department Purue University 250 N. University Street West Lafayette, IN 47907-2066 http://www.cs.purue.eu/people/comer Copyright 2006. All rights reserve.
More informationQueueing Model and Optimization of Packet Dropping in Real-Time Wireless Sensor Networks
Queueing Moel an Optimization of Packet Dropping in Real-Time Wireless Sensor Networks Marc Aoun, Antonios Argyriou, Philips Research, Einhoven, 66AE, The Netherlans Department of Computer an Communication
More informationYou Can Do That. Unit 16. Motivation. Computer Organization. Computer Organization Design of a Simple Processor. Now that you have some understanding
.. ou Can Do That Unit Computer Organization Design of a imple Clou & Distribute Computing (CyberPhysical, bases, Mining,etc.) Applications (AI, Robotics, Graphics, Mobile) ystems & Networking (Embee ystems,
More informationUninformed search methods
CS 1571 Introuction to AI Lecture 4 Uninforme search methos Milos Hauskrecht milos@cs.pitt.eu 539 Sennott Square Announcements Homework assignment 1 is out Due on Thursay, September 11, 014 before the
More informationSolution Representation for Job Shop Scheduling Problems in Ant Colony Optimisation
Solution Representation for Job Shop Scheuling Problems in Ant Colony Optimisation James Montgomery, Carole Faya 2, an Sana Petrovic 2 Faculty of Information & Communication Technologies, Swinburne University
More informationECE902 Virtual Machine Final Project: MIPS to CRAY-2 Binary Translation
ECE902 Virtual Machine Final Project: MIPS to CRAY-2 Binary Translation Weiping Liao, Saengrawee (Anne) Pratoomtong, and Chuan Zhang Abstract Binary translation is an important component for translating
More informationWorkspace as a Service: an Online Working Environment for Private Cloud
2017 IEEE Symposium on Service-Oriente System Engineering Workspace as a Service: an Online Working Environment for Private Clou Bo An 1, Xuong Shan 1, Zhicheng Cui 1, Chun Cao 2, Donggang Cao 1 1 Institute
More informationOffloading Cellular Traffic through Opportunistic Communications: Analysis and Optimization
1 Offloaing Cellular Traffic through Opportunistic Communications: Analysis an Optimization Vincenzo Sciancalepore, Domenico Giustiniano, Albert Banchs, Anreea Picu arxiv:1405.3548v1 [cs.ni] 14 May 24
More informationImage Segmentation using K-means clustering and Thresholding
Image Segmentation using Kmeans clustering an Thresholing Preeti Panwar 1, Girhar Gopal 2, Rakesh Kumar 3 1M.Tech Stuent, Department of Computer Science & Applications, Kurukshetra University, Kurukshetra,
More informationPreamble. Singly linked lists. Collaboration policy and academic integrity. Getting help
CS2110 Spring 2016 Assignment A. Linke Lists Due on the CMS by: See the CMS 1 Preamble Linke Lists This assignment begins our iscussions of structures. In this assignment, you will implement a structure
More informationFast Fractal Image Compression using PSO Based Optimization Techniques
Fast Fractal Compression using PSO Base Optimization Techniques A.Krishnamoorthy Visiting faculty Department Of ECE University College of Engineering panruti rishpci89@gmail.com S.Buvaneswari Visiting
More informationNearest Neighbor Search using Additive Binary Tree
Nearest Neighbor Search using Aitive Binary Tree Sung-Hyuk Cha an Sargur N. Srihari Center of Excellence for Document Analysis an Recognition State University of New York at Buffalo, U. S. A. E-mail: fscha,sriharig@cear.buffalo.eu
More informationDistributed Planning in Hierarchical Factored MDPs
Distribute Planning in Hierarchical Factore MDPs Carlos Guestrin Computer Science Dept Stanfor University guestrin@cs.stanfor.eu Geoffrey Goron Computer Science Dept Carnegie Mellon University ggoron@cs.cmu.eu
More informationThreshold Based Data Aggregation Algorithm To Detect Rainfall Induced Landslides
Threshol Base Data Aggregation Algorithm To Detect Rainfall Inuce Lanslies Maneesha V. Ramesh P. V. Ushakumari Department of Computer Science Department of Mathematics Amrita School of Engineering Amrita
More informationMORA: a Movement-Based Routing Algorithm for Vehicle Ad Hoc Networks
: a Movement-Base Routing Algorithm for Vehicle A Hoc Networks Fabrizio Granelli, Senior Member, Giulia Boato, Member, an Dzmitry Kliazovich, Stuent Member Abstract Recent interest in car-to-car communications
More informationProbabilistic Medium Access Control for. Full-Duplex Networks with Half-Duplex Clients
Probabilistic Meium Access Control for 1 Full-Duplex Networks with Half-Duplex Clients arxiv:1608.08729v1 [cs.ni] 31 Aug 2016 Shih-Ying Chen, Ting-Feng Huang, Kate Ching-Ju Lin, Member, IEEE, Y.-W. Peter
More informationNon-homogeneous Generalization in Privacy Preserving Data Publishing
Non-homogeneous Generalization in Privacy Preserving Data Publishing W. K. Wong, Nios Mamoulis an Davi W. Cheung Department of Computer Science, The University of Hong Kong Pofulam Roa, Hong Kong {wwong2,nios,cheung}@cs.hu.h
More informationState Indexed Policy Search by Dynamic Programming. Abstract. 1. Introduction. 2. System parameterization. Charles DuHadway
State Inexe Policy Search by Dynamic Programming Charles DuHaway Yi Gu 5435537 503372 December 4, 2007 Abstract We consier the reinforcement learning problem of simultaneous trajectory-following an obstacle
More informationA Plane Tracker for AEC-automation Applications
A Plane Tracker for AEC-automation Applications Chen Feng *, an Vineet R. Kamat Department of Civil an Environmental Engineering, University of Michigan, Ann Arbor, USA * Corresponing author (cforrest@umich.eu)
More informationGeneralized Edge Coloring for Channel Assignment in Wireless Networks
Generalize Ege Coloring for Channel Assignment in Wireless Networks Chun-Chen Hsu Institute of Information Science Acaemia Sinica Taipei, Taiwan Da-wei Wang Jan-Jan Wu Institute of Information Science
More informationA Metric for Routing in Delay-Sensitive Wireless Sensor Networks
A Metric for Routing in Delay-Sensitive Wireless Sensor Networks Zhen Jiang Jie Wu Risa Ito Dept. of Computer Sci. Dept. of Computer & Info. Sciences Dept. of Computer Sci. West Chester University Temple
More informationAdditional Divide and Conquer Algorithms. Skipping from chapter 4: Quicksort Binary Search Binary Tree Traversal Matrix Multiplication
Aitional Divie an Conquer Algorithms Skipping from chapter 4: Quicksort Binary Search Binary Tree Traversal Matrix Multiplication Divie an Conquer Closest Pair Let s revisit the closest pair problem. Last
More informationRobust PIM-SM Multicasting using Anycast RP in Wireless Ad Hoc Networks
Robust PIM-SM Multicasting using Anycast RP in Wireless A Hoc Networks Jaewon Kang, John Sucec, Vikram Kaul, Sunil Samtani an Mariusz A. Fecko Applie Research, Telcoria Technologies One Telcoria Drive,
More informationDense Disparity Estimation in Ego-motion Reduced Search Space
Dense Disparity Estimation in Ego-motion Reuce Search Space Luka Fućek, Ivan Marković, Igor Cvišić, Ivan Petrović University of Zagreb, Faculty of Electrical Engineering an Computing, Croatia (e-mail:
More informationHandling missing values in kernel methods with application to microbiology data
an Machine Learning. Bruges (Belgium), 24-26 April 2013, i6oc.com publ., ISBN 978-2-87419-081-0. Available from http://www.i6oc.com/en/livre/?gcoi=28001100131010. Hanling missing values in kernel methos
More informationOn Effectively Determining the Downlink-to-uplink Sub-frame Width Ratio for Mobile WiMAX Networks Using Spline Extrapolation
On Effectively Determining the Downlink-to-uplink Sub-frame With Ratio for Mobile WiMAX Networks Using Spline Extrapolation Panagiotis Sarigianniis, Member, IEEE, Member Malamati Louta, Member, IEEE, Member
More information5008: Computer Architecture
5008: Computer Architecture Chapter 2 Instruction-Level Parallelism and Its Exploitation CA Lecture05 - ILP (cwliu@twins.ee.nctu.edu.tw) 05-1 Review from Last Lecture Instruction Level Parallelism Leverage
More informationTable-based division by small integer constants
Table-base ivision by small integer constants Florent e Dinechin, Laurent-Stéphane Diier LIP, Université e Lyon (ENS-Lyon/CNRS/INRIA/UCBL) 46, allée Italie, 69364 Lyon Ceex 07 Florent.e.Dinechin@ens-lyon.fr
More informationBackpressure-based Packet-by-Packet Adaptive Routing in Communication Networks
1 Backpressure-base Packet-by-Packet Aaptive Routing in Communication Networks Eleftheria Athanasopoulou, Loc Bui, Tianxiong Ji, R. Srikant, an Alexaner Stolyar Abstract Backpressure-base aaptive routing
More informationSlicing Long Running Queries
licing Long unning Queries Nicolas Bruno Microsoft esearch nicolasb@microsoft.com Vivek Narasayya Microsoft esearch viveknar@microsoft.com avi amamurthy Microsoft esearch ravirama@microsoft.com ABTACT
More informationNon-Uniform Sensor Deployment in Mobile Wireless Sensor Networks
01 01 01 01 01 00 01 01 Non-Uniform Sensor Deployment in Mobile Wireless Sensor Networks Mihaela Carei, Yinying Yang, an Jie Wu Department of Computer Science an Engineering Floria Atlantic University
More informationGetting CPI under 1: Outline
CMSC 411 Computer Systems Architecture Lecture 12 Instruction Level Parallelism 5 (Improving CPI) Getting CPI under 1: Outline More ILP VLIW branch target buffer return address predictor superscalar more
More informationOn the Role of Multiply Sectioned Bayesian Networks to Cooperative Multiagent Systems
On the Role of Multiply Sectione Bayesian Networks to Cooperative Multiagent Systems Y. Xiang University of Guelph, Canaa, yxiang@cis.uoguelph.ca V. Lesser University of Massachusetts at Amherst, USA,
More informationOptimizing the quality of scalable video streams on P2P Networks
Optimizing the quality of scalable vieo streams on PP Networks Paper #7 ASTRACT The volume of multimeia ata, incluing vieo, serve through Peer-to-Peer (PP) networks is growing rapily Unfortunately, high
More informationNatural Language Processing (CSE 517): Dependency Syntax and Parsing
Natural Language Processing (CSE 517): Depenency Syntax an Parsing Noah A. Smith Swabha Swayamipta c 2018 University of Washington {nasmith,swabha}@cs.washington.eu May 11, 2018 1 / 93 Recap: Phrase Structure
More informationPART 5. Process Coordination And Synchronization
PART 5 Process Coorination An Synchronization CS 503 - PART 5 1 2010 Location Of Process Coorination In The Xinu Hierarchy CS 503 - PART 5 2 2010 Coorination Of Processes Necessary in a concurrent system
More informationComputer Organization
Computer Organization Douglas Comer Computer Science Department Purue University 250 N. University Street West Lafayette, IN 47907-2066 http://www.cs.purue.eu/people/comer Copyright 2006. All rights reserve.
More informationSocially-optimal ISP-aware P2P Content Distribution via a Primal-Dual Approach
Socially-optimal ISP-aware P2P Content Distribution via a Primal-Dual Approach Jian Zhao, Chuan Wu The University of Hong Kong {jzhao,cwu}@cs.hku.hk Abstract Peer-to-peer (P2P) technology is popularly
More informationInstruction Scheduling
Instruction Scheduling Michael O Boyle February, 2014 1 Course Structure Introduction and Recap Course Work Scalar optimisation and dataflow L5 Code generation L6 Instruction scheduling Next register allocation
More informationDeltaPath: Precise and Scalable Calling Context Encoding
DeltaPath: Precise an Scalable Calling Context Encoing Qiang Zeng, Junghwan Rhee, Hui Zhang, Nipun rora, Guofei Jiang, Peng Liu Penn State University, NEC Laboratories merica BSTRCT Calling context provies
More informationCS377P Programming for Performance Single Thread Performance Out-of-order Superscalar Pipelines
CS377P Programming for Performance Single Thread Performance Out-of-order Superscalar Pipelines Sreepathi Pai UTCS September 14, 2015 Outline 1 Introduction 2 Out-of-order Scheduling 3 The Intel Haswell
More informationBackpressure-based Packet-by-Packet Adaptive Routing in Communication Networks
1 Backpressure-base Packet-by-Packet Aaptive Routing in Communication Networks Eleftheria Athanasopoulou, Loc Bui, Tianxiong Ji, R. Srikant, an Alexaner Stoylar arxiv:15.4984v1 [cs.ni] 27 May 21 Abstract
More informationAdjacency Matrix Based Full-Text Indexing Models
1000-9825/2002/13(10)1933-10 2002 Journal of Software Vol.13, No.10 Ajacency Matrix Base Full-Text Inexing Moels ZHOU Shui-geng 1, HU Yun-fa 2, GUAN Ji-hong 3 1 (Department of Computer Science an Engineering,
More informationDistributing Frank-Wolfe via Map-Reduce
Distributing Frank-Wolfe via Map-Reuce Armin Moharrer an Stratis Ioanniis ortheastern University amoharrer@ece.neu.eu, ioanniis@ece.neu.eu Abstract We ientify structural properties uner which a convex
More informationPatterns for! Parallel Programming II!
Lecture 4! Patterns for! Parallel Programming II! John Cavazos! Dept of Computer & Information Sciences! University of Delaware! www.cis.udel.edu/~cavazos/cisc879! Task Decomposition Also known as functional
More informationFrequent Pattern Mining. Frequent Item Set Mining. Overview. Frequent Item Set Mining: Motivation. Frequent Pattern Mining comprises
verview Frequent Pattern Mining comprises Frequent Pattern Mining hristian Borgelt School of omputer Science University of Konstanz Universitätsstraße, Konstanz, Germany christian.borgelt@uni-konstanz.e
More informationLearning Polynomial Functions. by Feature Construction
I Proceeings of the Eighth International Workshop on Machine Learning Chicago, Illinois, June 27-29 1991 Learning Polynomial Functions by Feature Construction Richar S. Sutton GTE Laboratories Incorporate
More informationCrusoe Reference. What is Binary Translation. What is so hard about it? Thinking Outside the Box The Transmeta Crusoe Processor
Crusoe Reference Thinking Outside the Box The Transmeta Crusoe Processor 55:132/22C:160 High Performance Computer Architecture The Technology Behind Crusoe Processors--Low-power -Compatible Processors
More informationA Novel Density Based Clustering Algorithm by Incorporating Mahalanobis Distance
Receive: November 20, 2017 121 A Novel Density Base Clustering Algorithm by Incorporating Mahalanobis Distance Margaret Sangeetha 1 * Velumani Paikkaramu 2 Rajakumar Thankappan Chellan 3 1 Department of
More informationOn the Placement of Internet Taps in Wireless Neighborhood Networks
1 On the Placement of Internet Taps in Wireless Neighborhoo Networks Lili Qiu, Ranveer Chanra, Kamal Jain, Mohamma Mahian Abstract Recently there has emerge a novel application of wireless technology that
More informationDemystifying Automata Processing: GPUs, FPGAs or Micron s AP?
Demystifying Automata Processing: GPUs, FPGAs or Micron s AP? Marziyeh Nourian 1,3, Xiang Wang 1, Xiaoong Yu 2, Wu-chun Feng 2, Michela Becchi 1,3 1,3 Department of Electrical an Computer Engineering,
More informationPerformance Modelling of Necklace Hypercubes
erformance Moelling of ecklace ypercubes. Meraji,,. arbazi-aza,, A. atooghy, IM chool of Computer cience & harif University of Technology, Tehran, Iran {meraji, patooghy}@ce.sharif.eu, aza@ipm.ir a Abstract
More informationSMARQ: Software-Managed Alias Register Queue for Dynamic Optimizations
SMARQ: SoftwareManaged Alias Register Queue for Dynamic Optimizations heng Wang Youfeng Wu Hongbo Rong Hyunchul ark rogramming Systems Lab Microprocessor and rogramming Research Intel Labs {cheng.c.wang,
More informationDisjoint Multipath Routing in Dual Homing Networks using Colored Trees
Disjoint Multipath Routing in Dual Homing Networks using Colore Trees Preetha Thulasiraman, Srinivasan Ramasubramanian, an Marwan Krunz Department of Electrical an Computer Engineering University of Arizona,
More informationScalable Deterministic Scheduling for WDM Slot Switching Xhaul with Zero-Jitter
FDL sel. VOA SOA 100 Regular papers ONDM 2018 Scalable Deterministic Scheuling for WDM Slot Switching Xhaul with Zero-Jitter Bogan Uscumlic 1, Dominique Chiaroni 1, Brice Leclerc 1, Thierry Zami 2, Annie
More informationLab work #8. Congestion control
TEORÍA DE REDES DE TELECOMUNICACIONES Grao en Ingeniería Telemática Grao en Ingeniería en Sistemas e Telecomunicación Curso 2015-2016 Lab work #8. Congestion control (1 session) Author: Pablo Pavón Mariño
More informationOptimising for the p690 memory system
Optimising for the p690 memory Introduction As with all performance optimisation it is important to understand what is limiting the performance of a code. The Power4 is a very powerful micro-processor
More informationSuperscalar Processing (5) Superscalar Processors Ch 14. New dependency for superscalar case? (8) Output Dependency?
Superscalar Processors Ch 14 Limitations, Hazards Instruction Issue Policy Register Renaming Branch Prediction PowerPC, Pentium 4 1 Superscalar Processing (5) Basic idea: more than one instruction completion
More informationLecture 10: Static ILP Basics. Topics: loop unrolling, static branch prediction, VLIW (Sections )
Lecture 10: Static ILP Basics Topics: loop unrolling, static branch prediction, VLIW (Sections 4.1 4.4) 1 Static vs Dynamic Scheduling Arguments against dynamic scheduling: requires complex structures
More informationSuperscalar Processors Ch 14
Superscalar Processors Ch 14 Limitations, Hazards Instruction Issue Policy Register Renaming Branch Prediction PowerPC, Pentium 4 1 Superscalar Processing (5) Basic idea: more than one instruction completion
More informationTop-down Connectivity Policy Framework for Mobile Peer-to-Peer Applications
Top-own Connectivity Policy Framework for Mobile Peer-to-Peer Applications Otso Kassinen Mika Ylianttila Junzhao Sun Jussi Ala-Kurikka MeiaTeam Department of Electrical an Information Engineering University
More informationThe Journal of Systems and Software
The Journal of Systems an Software 83 (010) 1864 187 Contents lists available at ScienceDirect The Journal of Systems an Software journal homepage: www.elsevier.com/locate/jss Embeing capacity raising
More informationEnabling Rollback Support in IT Change Management Systems
Enabling Rollback Support in IT Change Management Systems Guilherme Sperb Machao, Fábio Fabian Daitx, Weverton Luis a Costa Coreiro, Cristiano Bonato Both, Luciano Paschoal Gaspary, Lisanro Zambeneetti
More informationData Mining: Clustering
Bi-Clustering COMP 790-90 Seminar Spring 011 Data Mining: Clustering k t 1 K-means clustering minimizes Where ist ( x, c i t i c t ) ist ( x m j 1 ( x ij i, c c t ) tj ) Clustering by Pattern Similarity
More informationMOBILE data is continuing to grow exponentially, and
IEEE/ACM TRANSACTIONS ON NETWORKING, VOL. 25, NO. 3, JUNE 2017 1431 Enhancing Mobile Networks With Software Define Networking an Clou Computing Zizhong Cao, Stuent Member, IEEE, Shivenra S. Panwar, Fellow,
More information