Optimization and Parallelization of Sequential Programs
|
|
- Claude Bishop
- 5 years ago
- Views:
Transcription
1 DF Advanced Compler Constructon TDDC86 Compler optmzatons and code generaton Optmzaton and Parallelzaton of Sequental Programs Lecture 7 Chrstoph Kessler IDA / PELAB Lnköpng Unversty Sweden Outlne Towards (sem-)automatc parallelzaton of sequental programs Data dependence analyss for loops Some loop transformatons Loop nvarant code hostng, loop unrollng, loop fuson, loop nterchange, loop blockng and tlng Statc loop parallelzaton Run-tme loop parallelzaton Doacross parallelzaton, Inspector-executor method Speculatve parallelzaton (as tme permts) Auto-tunng (later, f tme) Chrstoph Kessler, IDA, Lnköpngs unverstet, 4. Foundatons: Control and Data Dependence Foundatons: Control and Data Dependence Consder statements S, T n a sequental program (S=T possble) Scope of analyss s typcally a functon,.e. ntra-procedural analyss Assume that a control flow path S T s possble Can be done at arbtrary granularty (nstructons, operatons, statements, compound statements, program regons) Relevant are only the read and wrte effects on memory (.e. on program varables) by each operaton, and the effect on control flow Control dependence S T, f the fact whether T s executed may depend on S (e.g. condton) Imples that relatve executon order S T must be preserved when restructurng the program S: f () { T: Mostly obvous from nestng structure n well-structured programs, but more trcky n arbtrary branchng code (e.g. assembler code) Data dependence S T, f statement S may execute (dynamcally) before T and both may access the same memory locaton and at least one of these accesses s a wrte Means that executon order S before T must be preserved when restructurng the program In general, only a conservatve over-estmaton can be determned statcally flow dependence: (RAW, read-after-wrte) S may wrte a locaton z that T may read ant dependence: (WAR, wrte-after-read) S may read a locaton x that T may overwrtes output dependence: (WAW, wrte-after-wrte) both S and T may wrte the same locaton S: z = ; T: =..z.. ; (flow dependence) 3 4 Dependence Graph (Data, Control, Program) Dependence Graph: Drected graph, consstng of all statements as vertces and all (data, control, any) dependences as edges. Why Loop Optmzaton and Parallelzaton? Loops are a promsng obect for program optmzatons, ncludng automatc parallelzaton: Hgh executon frequency Most computaton done n (nner) loops Even small optmzatons can have large mpact (cf. Amdahl s Law) Regular, repettve behavor compact descrpton relatvely smple to analyze statcally Well researched 5 6
2 Loop Optmzatons General Issues Move loop nvarant computatons out of loops Modfy the order of teratons or parts thereof Goals: Improve data access localty Faster executon Reduce loop control overhead Enhance possbltes for loop parallelzaton or vectorzaton DF Advanced Compler Constructon TDDC86 Compler optmzatons and code generaton Data Dependence Analyss for Loops A more formal ntroducton Only transformatons that preserve the program semantcs (ts nput/output behavor) are admssble Conservatve (statc) crterum: preserve data dependences Need data dependence analyss for loops 7 Chrstoph Kessler, IDA, Lnköpngs unverstet, 4. Data Dependence Analyss Overvew Precedence relaton between statements Important for loop optmzatons, vectorzaton and parallelzaton, nstructon schedulng, data cache optmzatons Conservatve approxmatons to dsontness of pars of memory accesses weaker than data-flow analyss but generalzes ncely to the level of ndvdual array element Loops, loop nests Iteraton space Array subscrpts n loops Index space Dependence testng methods Data dependence graph Data + control dependence graph Program dependence graph 9 Data Dependence Graph Loop Iteraton Space Data dependence graph for straght-lne code ( basc block, no branchng) s always acyclc, because relatve executon order of statements s forward only. Data dependence graph for a loop: Dependence edge S T f a dependence may exst for some par of nstances (teratons) of S, T Cycles possble Loop-ndependent versus loop-carred dependences (assumng we know statcally that arrays a and b do not ntersect)
3 Example Loop Normalzaton (assumng that we statcally know that arrays A, X, Y, Z do not ntersect, otherwse there mght be further dependences) (Iteratons unrolled) Data dependence graph: S S 3 Dependence Dstance and Drecton 5 Lnear Dophantne Equatons 7 4 Dependence Equaton System 6 Dependence Testng, : GCD-Test 8 3
4 For multdmensonal arrays? 9 Survey of Dependence Tests DF Advanced Compler Constructon Loop Optmzatons General Issues TDDC86 Compler optmzatons and code generaton Move loop nvarant computatons out of loops Modfy the order of teratons or parts thereof Loop Transformatons and Parallelzaton Chrstoph Kessler, IDA, Lnköpngs unverstet, 4. Some mportant loop transformatons Loop normalzaton Goals: Improve data access localty Faster executon Reduce loop control overhead Enhance possbltes for loop parallelzaton or vectorzaton Only transformatons that preserve the program semantcs (ts nput/output behavor) are admssble Conservatve (statc) crterum: preserve data dependences Need data dependence analyss for loops Loop Invarant Code Hostng Move loop nvarant code out of the loop Loop parallelzaton Loop nvarant code hostng Loop nterchange Loop fuson vs. Loop dstrbuton / fsson Strp-mnng / loop tlng / blockng vs. Loop lnearzaton Loop unrollng, unroll-and-am Complers can do ths automatcally f they can statcally fnd out what code s loop nvarant for (=; <; ++) tmp = c / d; a[] = b[] + c / d; for (=; <; ++) a[] = b[] + tmp; Loop peelng Index set splttng, Loop unswtchng Scalar replacement, Scalar expanson Later: Software ppelnng More: Cycle shrnkng, Loop skewng,
5 Loop Unrollng Loop Interchange () Loop unrollng For properly nested loops Can be enforced wth compler optons e.g. funroll= (statements n nnermost loop body only) for (=; <5; ++) { a[] = b[]; Example : for (=; <M; ++) for ( =; <5; +=) { Unroll by : a[][] Reduces loop overhead (total # comparsons, branches, ncrements) a[][] row-wse storage of D-arrays n C, Java a[][m-] new teraton order Longer loop body may enable further local optmzatons (e.g. common subexpresson elmnaton, regster allocaton, nstructon schedulng, usng SIMD nstructons) old teraton order a[n-][] a[n-][] longer code Exercse: Formulate the unrollng rule for lmt C. Kessler, IDA, Lnköpngs unverstet. 5 statcally unknown TDDD56 upper Multcoreloop and GPU Programmng Foundatons: Loop-Carred Data Dependences Can mprove data access localty n memory herarchy (fewer cache msses / page faults) 6 Loop Interchange () Recall: Data dependence S T, S: z = ; f operaton S may execute (dynamcally) before operaton T and both may access the same memory locaton T: =..z.. ; and at least one of these accesses s a wrte In general, only a conservatve over-estmaton can be determned statcally. Be careful wth loop carred data dependences! = =3 T T T3 S S S3 for (=; <N; ++) for (=; <N; ++) for (=; <M; ++) a[][] =a[+][-]...; Iteraton space: =N- a[][] =a[+][-]; f the data dependence S T may exst for nstances of S and T n dfferent teratons of L. Iteraton space: = Example : for (=; <M; ++) Data dependence S T s called loop carred by a loop L TN- SN- Iteraton (,) reads locaton a[+][-] that was wrtten n an earler teraton, (-,+) Iteraton (,) reads locaton a[+][-], that wll be overwrtten n a later teraton (+,-) new teraton order old teraton order partal order between the operaton nstances resp. teratons 7 a[ ][ ] =. ; for (=; <M; ++) a[ ][ ] =. ; a[+] = b[+]; L: for (=; <N; ++) { T: = x[ - ]; S: x[ ] = ; for (=; <N; ++) for (=; <N; ++) a[] = b[]; Interchangng the loop headers would volate the partal teraton order gven by the data dependences 8 Loop Interchange (3) Loop Fuson Be careful wth loop-carred data dependences! Merge subsequent loops wth same header Example 3: for (=; <M; ++) for (=; <N; ++) for (=; <N; ++) OK for (=; <M; ++) a[][] =a[-][-]...; a[][] =a[-][-]; Iteraton space: new teraton order old teraton order Iteraton (,) reads locaton a[-][-] that was wrtten n earler teraton (-,-) Generally: Interchangng loop headers s only admssble f loop-carred dependences have the same drecton for all loops n the loop nest (all drected along or all aganst the teraton order) 9 Safe f nether loop carres a (backward) dependence for (=; <N; ++) for (= ; <N; ++) { a[ ] = ; a[ ] = ; for (=; <N; ++) Iteraton (,) reads locaton a[-][-] that was wrtten n earler teraton (-,-) = a[ ] ; = a[ ] ; OK Read of a[] stll after wrte of a[], for all For N suffcently large, a[] wll no longer be n the cache at ths tme Can mprove data access localty and reduces number of branches 3 5
6 Loop Iteraton Reorderng Loop Parallelzaton -loop carres a dependence, ts teraton order must be preserved -loop carres a dependence, ts teraton order must be preserved Loop parallelzaton 3 Remark on Loop Parallelzaton 3 Strp Mnng / Loop Blockng / -Tlng Introducng temporary copes of arrays can remove some antdependences to enable automatc loop parallelzaton for (=; <n; ++) a[] = a[] + a[+]; The loop-carred dependence can be elmnated: for (=; <n; ++) aold[+] = a[+]; for (=; <n; ++) a[] = a[] + aold[+]; Parallelzable loop Parallelzable loop Tled Matrx-Matrx Multplcaton () Tled Matrx-Matrx Multplcaton () Matrx-Matrx multplcaton C = A x B Block each loop by block sze S here for square (n x n) matrces C, A, B, wth n large (~3): C = S k=..n A k B k for all, =...n A for (=; <n; +=S) for (kk=; kk<n; kk+=s) k for (=; <n; ++) k kk kk for (=; <n; +=S) (here wthout the ntalzaton of C-entres to ): k B Good spatal localty for A, B and C for (=; < +S; ++) for (=; < +S; ++) for (k=; k<n; k++) C[][] += A[][k] * B[k][]; Code after tlng: Standard algorthm for Matrx-Matrx multplcaton for (=; <n; ++) (choose S so that a block of A, B, C ft n cache together), k then nterchange loops k 35 Good spatal localty on A, C for (k=kk; k < kk+s; k++) Bad spatal localty on B (many capacty msses) C[][] += A[][k] * B[k][]; 36 6
7 Remark on Localty Transformatons Loop Dstrbuton (a.k.a. Loop Fsson) An alternatve can be to change the data layout rather than the control structure of the program Store matrx B n transposed form, or, f necessary, consder transposng t, whch may pay off over several subsequent computatons Fndng the best layout for all multdmensonal arrays s a NP-complete optmzaton problem [Mace, 988] Recursve array layouts that preserve localty Morton-order Herarchcally layout tled arrays In the best case, can make computatons cache-oblvous Performance largely ndependent of TDDD56 cache sze 37 Multcore and GPU Programmng Loop Fuson 38 Loop Nest Flattenng / Lnearzaton 39 Loop Unrollng 4 Loop Unrollng wth Unknown Upper Bound 4 4 7
8 Loop Unroll-And-Jam 43 Loop Peelng Index Set Splttng Loop Unswtchng 45 Scalar Replacement Scalar Expanson / Array Prvatzaton
9 Idom recognton and algorthm replacement DF Advanced Compler Constructon TDDC86 Compler optmzatons and code generaton C. Kessler: Pattern-drven automatc parallelzaton. Scentfc Programmng, 996. Concludng Remarks A. Shafee-Sarvestan, E. Hansson, C. Kessler: Extensble recognton of algorthmc patterns n DSP programs for automatc parallelzaton. Int. J. on Parallel Programmng, Lmts of Statc Analyzablty Outlook: Runtme Analyss and Parallelzaton Chrstoph Kessler, IDA, Lnköpngs unverstet, 4. Remark on statc analyzablty () Remark on statc analyzablty () Statc dependence nformaton s always a (safe) Statc dependence nformaton s always a (safe) overapproxmaton of the real (run-tme) dependences overapproxmaton of the real (run-tme) dependences Fndng out the real ones exactly s statcally undecdable! Fndng out the latter exactly s statcally undecdable! If n doubt, a dependence must be assumed may prevent some optmzatons or parallelzaton If n doubt, a dependence must be assumed may prevent some optmzatons or parallelzaton One man reason for mprecson s alasng,.e. the program may have several ways to refer to the same memory locaton Ponter alasng vod mergesort ( nt* a, nt n ) { mergesort ( a, n/ ); mergesort ( a + n/, n-n/ ); 5 Another reason for mprecson are statcally unknown values that mply whether a dependence exsts or not Unknown dependence dstance // value of K statcally unknown Loop-carred dependence for ( =; <N; ++ ) f K < N. Otherwse, the loop s { parallelzable. S: a[] = a[] + a[k]; 5 How could a statc analyss tool (e.g., compler) know that the two recursve calls read and wrte dsont subarrays of a? Outlook: Runtme Parallelzaton DF Advanced Compler Constructon TDDC86 Compler optmzatons and code generaton Run-Tme Parallelzaton 53 Chrstoph Kessler, IDA, Lnköpngs unverstet, 4. 9
10 Goal of run-tme parallelzaton Overvew Typcal target: rregular loops Run-tme parallelzaton of rregular loops for ( =; <n; ++) a[] = f ( a[ g() ], a[ h() ],... ); DOACROSS parallelzaton Inspector-Executor Technque (shared memory) Array ndex expressons g, h... depend on run-tme data Inspector-Executor Technque (message passng) * Prvatzng DOALL Test * Iteratons cannot be statcally proved ndependent (and not ether dependent wth dstance +) Speculatve run-tme parallelzaton of rregular loops * Prncple: At runtme, nspect g, h... to fnd out the real dependences and compute a schedule for partally parallel executon Can also be combned wth speculatve parallelzaton LRPD Test * General Thread-Level Speculaton 55 Hardware support * * = not covered n ths course. See the references. 56 DOACROSS Parallelzaton Inspector-Executor Technque () Useful f loop-carred dependence dstances are unknown, but often > Compler generates peces of customzed code for such loops: Allow ndependent subsequent loop teratons to overlap Blateral synchronzaton between really-dependent teratons Inspector for ( =; <n; ++) a[] = f ( a[ g() ],... ); calculates values of ndex expresson by smulatng whole loop executon typcally, based on sequental verson of the source loop (some computatons could be left out) sh float aold[n]; sh flag done[n]; // flag (semaphore) array forall n..n- { // spawn n threads, one per teraton done[n] = ; aold[] = a[]; // create a copy forall n..n- { // spawn n threads, one per teraton f (g() < ) wat untl done[ g() ] ); a[] = f ( a[ g() ],... ); set( done[] ); else a[] = f ( aold[ g() ],... ); set done[]; 57 Inspector-Executor Technque () computes mplctly the real teraton dependence graph computes a parallel schedule as (greedy) wavefront traversal of the teraton dependence graph n topologcal order all teratons n same wavefront are ndependent schedule depth = #wavefronts = crtcal path length Executor follows ths schedule to execute the loop for ( =; <n; ++) a[] = f ( a[ g() ], a[ h() ],... ); for (=; <n; ++) a[] =... a[ g() ]...; Inspector: nt wf[n]; // wavefront ndces nt depth = ; for (=; <n; ++) wf[] = ; // nt. for (=; <n; ++) { wf[] = max ( wf[ g() ], wf[ h() ],... ) + ; depth = max ( depth, wf[] ); Inspector consders only flow dependences (RAW), ant- and output dependences to be preserved by executor 59 Inspector-Executor Technque (3) Source loop: 58 Executor: g() wf[] g()<? no yes no yes yes yes float aold[n]; // buffer array aold[:n] = a[:n]; 3 4 for (w=; w<depth; w++) forall (,, n, #) f (wf[] == w) { 5 a = (g() < )? a[g()] : aold[g()];... // smlarly, a for h etc. a[] = f ( a, a,... ); teraton (flow) dependence graph 6
11 Inspector-Executor Technque (4) DF Advanced Compler Constructon TDDC86 Compler optmzatons and code generaton Problem: Inspector remans sequental no speedup Soluton approaches: Re-use schedule over subsequent teratons of an outer loop f access pattern does not change Thread-Level Speculaton amortzes nspector overhead across repeated executons Parallelze the nspector usng doacross parallelzaton [Saltz,Mrchandaney 9] Parallelze the nspector usng sectonng [Leung/Zahoran 9] compute processor-local wavefronts n parallel, concatenate trade-off schedule qualty (depth) vs. nspector speed Parallelze the nspector usng bootstrappng [Leung/Z. 9] Start wth suboptmal schedule by sectonng, use ths to execute the nspector refned schedule 6 Chrstoph Kessler, IDA, Lnköpngs unverstet, 4. Speculatvely parallel executon TLS Example For automatc parallelzaton of sequental code where dependences are hard to analyze statcally Works on a task graph constructed mplctly and dynamcally Speculate on: control flow, data ndependence, synchronzaton, values We focus on thread-level speculaton (TLS) for CMP/MT processors. Speculatve nstructon-level parallelsm s not consdered here. Task: statcally: Connected, sngle-entry subgraph of the controlflow graph Basc blocks, loop bodes, loops, or entre functons dynamcally: Contguous fragment of dynamc nstructon stream wthn statc task regon, entered at statc task entry 63 Data dependence problem n TLS Explotng module-level speculatve parallelsm (across functon calls) Source: F. Warg: Technques for Reducng Thread-Level Speculaton Overhead n Chp Multprocessors. PhD thess, Chalmers TH, Gothenburg, June Speculatvely parallel executon of tasks Speculaton on nter-task control flow After havng assgned a task, predct ts successor task and start t speculatvely Speculaton on data ndependence For nter-task memory data (flow) dependences conservatvely: speculatvely: awat wrte (memory synchronzaton, message) hope for ndependence and contnue (execute the load) Roll-back of speculatve results on ms-speculaton (expensve) When startng speculaton, state must be buffered Squash an offendng task and all ts successors, restart Commt speculatve results when speculaton resolved to correct Source: F. Warg: Technques for Reducng Thread-Level Speculaton Overhead n Chp Multprocessors. PhD thess, Chalmers TH, Gothenburg, June Task s retred 66
12 Selectng Tasks for Speculaton TLS Implementatons Small tasks: Software-only speculaton too much overhead (task startup, task retrement) low parallelsm degree Large tasks: hgher msspeculaton probablty hgher rollback cost many speculatons ongong n parallel may saturate the resources Load balancng ssues avod large varaton n task szes Traversal of the program s control flow graph (CFG) for loops [Rauchwerger, Padua 94, 95]... Hardware-based speculaton Typcally, ntegrated n cache coherence protocols Used wth multthreaded processors / chp multprocessors for automatc parallelzaton of sequental legacy code If source code avalable, compler may help e.g. wth dentfyng sutable threads Heurstcs for task sze, control and data dep. speculaton Some references on Dependence Analyss, Loop optmzatons and Transformatons DF Advanced Compler Constructon TDDC86 Compler optmzatons and code generaton H. Zma, B. Chapman: Supercomplers for Parallel and Vector Computers. Addson-Wesley / ACM press, 99. M. Wolfe: Hgh-Performance Complers for Parallel Computng. Addson-Wesley, 996. Questons? R. Allen, K. Kennedy: Optmzng Complers for Modern Archtectures. Morgan Kaufmann,. Chrstoph Kessler, IDA, Lnköpngs unverstet, 4. Some references on run-tme parallelzaton Idom recognton and algorthm replacement: C. Kessler: Pattern-drven automatc parallelzaton. Scentfc Programmng 5:5-74, 996. A. Shafee-Sarvestan, E. Hansson, C. Kessler: Extensble recognton of algorthmc patterns n DSP programs for automatc paral-lelzaton. Int. J. on Parallel Programmng, 3. C. Kessler, IDA, Lnköpngs unverstet. 7 Some references on speculatve executon / parallelzaton R. Cytron: Doacross: Beyond vectorzaton for multprocessors. Proc. ICPP-986 D. Chen, J. Torrellas, P. Yew: An Effcent Algorthm for the Run-tme Parallelzaton of DOACROSS Loops, Proc. IEEE Supercomputng Conf., Nov. 4, IEEE CS Press, pp J. Martnez, J. Torrellas: Speculatve Locks for Concurrent Executon of Crtcal R. Mrchandaney, J. Saltz, R. M. Smth, D. M. Ncol, K. Crowley: Prncples of run-tme support for parallel processors, Proc. ACM Int. Conf. on Supercomputng, July 988, pp F. Warg and P. Stenström: Lmts on speculatve module-level parallelsm n J. Saltz and K. Crowley and R. Mrchandaney and H. Berryman: Runtme Schedulng and Executon of Loops on Message Passng Machnes, Journal on Parallel and Dstr. Computng 8 (99): J. Saltz, R. Mrchandaney: The preprocessed doacross loop. Proc. ICPP-99 Int. Conf. on Parallel Processng. S. Leung, J. Zahoran: Improvng the performance of run-tme parallelzaton. Proc. ACM PPoPP-993, pp Lawrence Rauchwerger, Davd Padua: The Prvatzng DOALL Test: A Run-Tme Technque for DOALL Loop Identfcaton and Array Prvatzaton. Proc. ACM Int. Conf. on Supercomputng, July 994, pp Lawrence Rauchwerger, Davd Padua: The LRPD Test: Speculatve Run-Tme Parallelzaton of Loops wth Prvatzaton and Reducton Parallelzaton. Proc. ACM SIGPLAN PLDI-95, 995, pp T. Vaykumar, G. Soh: Task Selecton for a Multscalar Processor. Proc. MICRO-3, Dec Sectons n Shared-Memory Multprocessors. Proc. WMPI at ISCA,. mperatve and obect-orented programs on CMP platforms. Pr. IEEE PACT. P. Marcuello and A. Gonzalez: Thread-spawnng schemes for speculatve multthreadng. Proc. HPCA-8,. J. Steffan et al.: Improvng value communcaton for thread-level speculaton. HPCA-8,. M. Cntra, J. Torrellas: Elmnatng squashes through learnng cross-thread volatons n speculatve parallelzaton for multprocessors. HPCA-8,. Fredrk Warg and Per Stenström: Improvng speculatve thread-level parallelsm through module run-length predcton. Proc. IPDPS 3. F. Warg: Technques for Reducng Thread-Level Speculaton Overhead n Chp Multprocessors. PhD thess, Chalmers TH, Gothenburg, June 6. T. Ohsawa et al.: Pnot: Speculatve mult-threadng processor archtecture explotng parallelsm over a wde range of granulartes. Proc. MICRO-38, 5. 7
Loop Permutation. Loop Transformations for Parallelism & Locality. Legality of Loop Interchange. Loop Interchange (cont)
Loop Transformatons for Parallelsm & Localty Prevously Data dependences and loops Loop transformatons Parallelzaton Loop nterchange Today Loop nterchange Loop transformatons and transformaton frameworks
More informationVectorization in the Polyhedral Model
Vectorzaton n the Polyhedral Model Lous-Noël Pouchet pouchet@cse.oho-state.edu Dept. of Computer Scence and Engneerng, the Oho State Unversty October 200 888. Introducton: Overvew Vectorzaton: Detecton
More informationLoop Transformations for Parallelism & Locality. Review. Scalar Expansion. Scalar Expansion: Motivation
Loop Transformatons for Parallelsm & Localty Last week Data dependences and loops Loop transformatons Parallelzaton Loop nterchange Today Scalar expanson for removng false dependences Loop nterchange Loop
More informationLecture 15: Memory Hierarchy Optimizations. I. Caches: A Quick Review II. Iteration Space & Loop Transformations III.
Lecture 15: Memory Herarchy Optmzatons I. Caches: A Quck Revew II. Iteraton Space & Loop Transformatons III. Types of Reuse ALSU 7.4.2-7.4.3, 11.2-11.5.1 15-745: Memory Herarchy Optmzatons Phllp B. Gbbons
More informationParallelism for Nested Loops with Non-uniform and Flow Dependences
Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr
More informationPolyhedral Compilation Foundations
Polyhedral Complaton Foundatons Lous-Noël Pouchet pouchet@cse.oho-state.edu Dept. of Computer Scence and Engneerng, the Oho State Unversty Feb 8, 200 888., Class # Introducton: Polyhedral Complaton Foundatons
More informationLoop Transformations, Dependences, and Parallelization
Loop Transformatons, Dependences, and Parallelzaton Announcements Mdterm s Frday from 3-4:15 n ths room Today Semester long project Data dependence recap Parallelsm and storage tradeoff Scalar expanson
More informationSorting Review. Sorting. Comparison Sorting. CSE 680 Prof. Roger Crawfis. Assumptions
Sortng Revew Introducton to Algorthms Qucksort CSE 680 Prof. Roger Crawfs Inserton Sort T(n) = Θ(n 2 ) In-place Merge Sort T(n) = Θ(n lg(n)) Not n-place Selecton Sort (from homework) T(n) = Θ(n 2 ) In-place
More informationToday Using Fourier-Motzkin elimination for code generation Using Fourier-Motzkin elimination for determining schedule constraints
Fourer Motzkn Elmnaton Logstcs HW10 due Frday Aprl 27 th Today Usng Fourer-Motzkn elmnaton for code generaton Usng Fourer-Motzkn elmnaton for determnng schedule constrants Unversty Fourer-Motzkn Elmnaton
More informationCMPS 10 Introduction to Computer Science Lecture Notes
CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not
More informationUncorrected Proof. Thread-Level Speculation
Encyclopeda of Parallel Computng 00170 2011/3/8 12:30 Page 1 #2 T 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 Thread-Level Josep Torrellas Unversty of Illnos
More informationAssembler. Building a Modern Computer From First Principles.
Assembler Buldng a Modern Computer From Frst Prncples www.nand2tetrs.org Elements of Computng Systems, Nsan & Schocken, MIT Press, www.nand2tetrs.org, Chapter 6: Assembler slde Where we are at: Human Thought
More informationParallel Numerics. 1 Preconditioning & Iterative Solvers (From 2016)
Technsche Unverstät München WSe 6/7 Insttut für Informatk Prof. Dr. Thomas Huckle Dpl.-Math. Benjamn Uekermann Parallel Numercs Exercse : Prevous Exam Questons Precondtonng & Iteratve Solvers (From 6)
More informationCache Performance 3/28/17. Agenda. Cache Abstraction and Metrics. Direct-Mapped Cache: Placement and Access
Agenda Cache Performance Samra Khan March 28, 217 Revew from last lecture Cache access Assocatvty Replacement Cache Performance Cache Abstracton and Metrcs Address Tag Store (s the address n the cache?
More informationVirtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory
Background EECS. Operatng System Fundamentals No. Vrtual Memory Prof. Hu Jang Department of Electrcal Engneerng and Computer Scence, York Unversty Memory-management methods normally requres the entre process
More informationParallel matrix-vector multiplication
Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more
More informationCompiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz
Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster
More informationCache Memories. Lecture 14 Cache Memories. Inserting an L1 Cache Between the CPU and Main Memory. General Org of a Cache Memory
Topcs Lecture 4 Cache Memores Generc cache memory organzaton Drect mapped caches Set assocate caches Impact of caches on performance Cache Memores Cache memores are small, fast SRAM-based memores managed
More information2.1. The Program Model
Hyperplane Parttonng : n pproach to Global ata Parttonng for strbuted Memory Machnes S. R. Prakash and Y.. Srkant epartment of S, Indan Insttute of Scence angalore, Inda, 6 bstract utomatc Global ata Parttonng
More informationLLVM passes and Intro to Loop Transformation Frameworks
LLVM passes and Intro to Loop Transformaton Frameworks Announcements Ths class s recorded and wll be n D2L panapto. No quz Monday after sprng break. Wll be dong md-semester class feedback. Today LLVM passes
More informationELEC 377 Operating Systems. Week 6 Class 3
ELEC 377 Operatng Systems Week 6 Class 3 Last Class Memory Management Memory Pagng Pagng Structure ELEC 377 Operatng Systems Today Pagng Szes Vrtual Memory Concept Demand Pagng ELEC 377 Operatng Systems
More informationCSCI 104 Sorting Algorithms. Mark Redekopp David Kempe
CSCI 104 Sortng Algorthms Mark Redekopp Davd Kempe Algorthm Effcency SORTING 2 Sortng If we have an unordered lst, sequental search becomes our only choce If we wll perform a lot of searches t may be benefcal
More informationProgramming in Fortran 90 : 2017/2018
Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values
More informationModule Management Tool in Software Development Organizations
Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,
More informationToday s Outline. Sorting: The Big Picture. Why Sort? Selection Sort: Idea. Insertion Sort: Idea. Sorting Chapter 7 in Weiss.
Today s Outlne Sortng Chapter 7 n Wess CSE 26 Data Structures Ruth Anderson Announcements Wrtten Homework #6 due Frday 2/26 at the begnnng of lecture Proect Code due Mon March 1 by 11pm Today s Topcs:
More informationParallel and Distributed Association Rule Mining - Dr. Giuseppe Di Fatta. San Vigilio,
Parallel and Dstrbuted Assocaton Rule Mnng - Dr. Guseppe D Fatta fatta@nf.un-konstanz.de San Vglo, 18-09-2004 1 Overvew Assocaton Rule Mnng (ARM) Apror algorthm Hgh Performance Parallel and Dstrbuted Computng
More informationConcurrent Apriori Data Mining Algorithms
Concurrent Apror Data Mnng Algorthms Vassl Halatchev Department of Electrcal Engneerng and Computer Scence York Unversty, Toronto October 8, 2015 Outlne Why t s mportant Introducton to Assocaton Rule Mnng
More information[KV99] M. Kaul and R, Vemuri. Integrated Block-Processing and Design-Space Exploration in Temporal Partitioning for RTR Architectures.
[KV99] M. Kaul and R, Vemur. Integrated Block-Processng and Desgn-Space Exploraton n Temporal Parttonng for RTR Archtectures. Reconfgurable Archtectures Workshop Proceedngs, Puerto Rco, Aprl 1999. [OCS98]
More informationWavefront Reconstructor
A Dstrbuted Smplex B-Splne Based Wavefront Reconstructor Coen de Vsser and Mchel Verhaegen 14-12-201212 2012 Delft Unversty of Technology Contents Introducton Wavefront reconstructon usng Smplex B-Splnes
More informationAn Entropy-Based Approach to Integrated Information Needs Assessment
Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology
More informationAgenda & Reading. Simple If. Decision-Making Statements. COMPSCI 280 S1C Applications Programming. Programming Fundamentals
Agenda & Readng COMPSCI 8 SC Applcatons Programmng Programmng Fundamentals Control Flow Agenda: Decsonmakng statements: Smple If, Ifelse, nested felse, Select Case s Whle, DoWhle/Untl, For, For Each, Nested
More informationThe Codesign Challenge
ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.
More informationGiving credit where credit is due
CSCE 23J Computer Organzaton Cache Memores Dr. Stee Goddard goddard@cse.unl.edu Gng credt where credt s due Most of sldes for ths lecture are based on sldes created by Drs. Bryant and O Hallaron, Carnege
More informationCS221: Algorithms and Data Structures. Priority Queues and Heaps. Alan J. Hu (Borrowing slides from Steve Wolfman)
CS: Algorthms and Data Structures Prorty Queues and Heaps Alan J. Hu (Borrowng sldes from Steve Wolfman) Learnng Goals After ths unt, you should be able to: Provde examples of approprate applcatons for
More informationThe Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique
//00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy
More informationAADL : about scheduling analysis
AADL : about schedulng analyss Schedulng analyss, what s t? Embedded real-tme crtcal systems have temporal constrants to meet (e.g. deadlne). Many systems are bult wth operatng systems provdng multtaskng
More informationDijkstra s Single Source Algorithm. All-Pairs Shortest Paths. Dynamic Programming Solution. Performance. Decision Sequence.
All-Pars Shortest Paths Gven an n-vertex drected weghted graph, fnd a shortest path from vertex to vertex for each of the n vertex pars (,). Dstra s Sngle Source Algorthm Use Dstra s algorthm n tmes, once
More informationSLAM Summer School 2006 Practical 2: SLAM using Monocular Vision
SLAM Summer School 2006 Practcal 2: SLAM usng Monocular Vson Javer Cvera, Unversty of Zaragoza Andrew J. Davson, Imperal College London J.M.M Montel, Unversty of Zaragoza. josemar@unzar.es, jcvera@unzar.es,
More informationSequential search. Building Java Programs Chapter 13. Sequential search. Sequential search
Sequental search Buldng Java Programs Chapter 13 Searchng and Sortng sequental search: Locates a target value n an array/lst by examnng each element from start to fnsh. How many elements wll t need to
More informationEstimation of Parallel Complexity with Rewriting Techniques
Estmaton of Parallel Complexty wth Rewrtng Technques Chrstophe Alas, Carsten Fuhs, Laure Gonnord INRIA & LIP (UMR CNRS/ENS Lyon/UCB Lyon1/INRIA), Lyon, France, chrstophe.alas@ens-lyon.fr Brkbeck, Unversty
More informationSupport Vector Machines
/9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.
More informationOutline. Discriminative classifiers for image recognition. Where in the World? A nearest neighbor recognition example 4/14/2011. CS 376 Lecture 22 1
4/14/011 Outlne Dscrmnatve classfers for mage recognton Wednesday, Aprl 13 Krsten Grauman UT-Austn Last tme: wndow-based generc obect detecton basc ppelne face detecton wth boostng as case study Today:
More informationAssembler. Shimon Schocken. Spring Elements of Computing Systems 1 Assembler (Ch. 6) Compiler. abstract interface.
IDC Herzlya Shmon Schocken Assembler Shmon Schocken Sprng 2005 Elements of Computng Systems 1 Assembler (Ch. 6) Where we are at: Human Thought Abstract desgn Chapters 9, 12 abstract nterface H.L. Language
More informationVerification by testing
Real-Tme Systems Specfcaton Implementaton System models Executon-tme analyss Verfcaton Verfcaton by testng Dad? How do they know how much weght a brdge can handle? They drve bgger and bgger trucks over
More informationMidterms Save the Dates!
Unversty of Brtsh Columba CPSC, Intro to Computaton Alan J. Hu Readngs Ths Week: Ch 6 (Ch 7 n old 2 nd ed). (Remnder: Readngs are absolutely vtal for learnng ths stuff!) Thnkng About Loops Lecture 9 Some
More informationConditional Speculative Decimal Addition*
Condtonal Speculatve Decmal Addton Alvaro Vazquez and Elsardo Antelo Dep. of Electronc and Computer Engneerng Unv. of Santago de Compostela, Span Ths work was supported n part by Xunta de Galca under grant
More informationProblem Set 3 Solutions
Introducton to Algorthms October 4, 2002 Massachusetts Insttute of Technology 6046J/18410J Professors Erk Demane and Shaf Goldwasser Handout 14 Problem Set 3 Solutons (Exercses were not to be turned n,
More informationIntroduction to Programming. Lecture 13: Container data structures. Container data structures. Topics for this lecture. A basic issue with containers
1 2 Introducton to Programmng Bertrand Meyer Lecture 13: Contaner data structures Last revsed 1 December 2003 Topcs for ths lecture 3 Contaner data structures 4 Contaners and genercty Contan other objects
More informationPetri Net Based Software Dependability Engineering
Proc. RELECTRONIC 95, Budapest, pp. 181-186; October 1995 Petr Net Based Software Dependablty Engneerng Monka Hener Brandenburg Unversty of Technology Cottbus Computer Scence Insttute Postbox 101344 D-03013
More informationMotivation. EE 457 Unit 4. Throughput vs. Latency. Performance Depends on View Point?! Computer System Performance. An individual user wants to:
4.1 4.2 Motvaton EE 457 Unt 4 Computer System Performance An ndvdual user wants to: Mnmze sngle program executon tme A datacenter owner wants to: Maxmze number of Mnmze ( ) http://e-tellgentnternetmarketng.com/webste/frustrated-computer-user-2/
More informationADJUSTING A PROGRAM TRANSFORMATION FOR LEGALITY
Parallel Processng Letters c World Scentfc Publshng Company ADJUSTING A PROGRAM TRANSFORMATION FOR LEGALITY CÉDRIC BASTOUL Laboratore PRSM, Unversté de Versalles Sant Quentn 45 avenue des États-Uns, 785
More informationParallel Solutions of Indexed Recurrence Equations
Parallel Solutons of Indexed Recurrence Equatons Yos Ben-Asher Dep of Math and CS Hafa Unversty 905 Hafa, Israel yos@mathcshafaacl Gad Haber IBM Scence and Technology 905 Hafa, Israel haber@hafascvnetbmcom
More informationA mathematical programming approach to the analysis, design and scheduling of offshore oilfields
17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 A mathematcal programmng approach to the analyss, desgn and
More informationHarvard University CS 101 Fall 2005, Shimon Schocken. Assembler. Elements of Computing Systems 1 Assembler (Ch. 6)
Harvard Unversty CS 101 Fall 2005, Shmon Schocken Assembler Elements of Computng Systems 1 Assembler (Ch. 6) Why care about assemblers? Because Assemblers employ some nfty trcks Assemblers are the frst
More informationDijkstra s Single Source Algorithm. All-Pairs Shortest Paths. Dynamic Programming Solution. Performance
All-Pars Shortest Paths Gven an n-vertex drected weghted graph, fnd a shortest path from vertex to vertex for each of the n vertex pars (,). Dkstra s Sngle Source Algorthm Use Dkstra s algorthm n tmes,
More informationDynamic Programming. Example - multi-stage graph. sink. source. Data Structures &Algorithms II
Dynamc Programmng Example - mult-stage graph 1 source 9 7 3 2 2 3 4 5 7 11 4 11 8 2 2 1 6 7 8 4 6 3 5 6 5 9 10 11 2 4 5 12 snk Data Structures &Algorthms II A labeled, drected graph Vertces can be parttoned
More informationEfficient Distributed File System (EDFS)
Effcent Dstrbuted Fle System (EDFS) (Sem-Centralzed) Debessay(Debsh) Fesehaye, Rahul Malk & Klara Naherstedt Unversty of Illnos-Urbana Champagn Contents Problem Statement, Related Work, EDFS Desgn Rate
More informationImproving High Level Synthesis Optimization Opportunity Through Polyhedral Transformations
Improvng Hgh Level Synthess Optmzaton Opportunty Through Polyhedral Transformatons We Zuo 2,5, Yun Lang 1, Peng L 1, Kyle Rupnow 3, Demng Chen 2,3 and Jason Cong 1,4 1 Center for Energy-Effcent Computng
More informationChapter 1. Introduction
Chapter 1 Introducton 1.1 Parallel Processng There s a contnual demand for greater computatonal speed from a computer system than s currently possble (.e. sequental systems). Areas need great computatonal
More informationCS 534: Computer Vision Model Fitting
CS 534: Computer Vson Model Fttng Sprng 004 Ahmed Elgammal Dept of Computer Scence CS 534 Model Fttng - 1 Outlnes Model fttng s mportant Least-squares fttng Maxmum lkelhood estmaton MAP estmaton Robust
More informationNews. Recap: While Loop Example. Reading. Recap: Do Loop Example. Recap: For Loop Example
Unversty of Brtsh Columba CPSC, Intro to Computaton Jan-Apr Tamara Munzner News Assgnment correctons to ASCIIArtste.java posted defntely read WebCT bboards Arrays Lecture, Tue Feb based on sldes by Kurt
More informationMachine Learning. Topic 6: Clustering
Machne Learnng Topc 6: lusterng lusterng Groupng data nto (hopefully useful) sets. Thngs on the left Thngs on the rght Applcatons of lusterng Hypothess Generaton lusters mght suggest natural groups. Hypothess
More informationAMath 483/583 Lecture 21 May 13, Notes: Notes: Jacobi iteration. Notes: Jacobi with OpenMP coarse grain
AMath 483/583 Lecture 21 May 13, 2011 Today: OpenMP and MPI versons of Jacob teraton Gauss-Sedel and SOR teratve methods Next week: More MPI Debuggng and totalvew GPU computng Read: Class notes and references
More informationSupport Vector Machines
Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned
More informationCHARUTAR VIDYA MANDAL S SEMCOM Vallabh Vidyanagar
CHARUTAR VIDYA MANDAL S SEMCOM Vallabh Vdyanagar Faculty Name: Am D. Trved Class: SYBCA Subject: US03CBCA03 (Advanced Data & Fle Structure) *UNIT 1 (ARRAYS AND TREES) **INTRODUCTION TO ARRAYS If we want
More informationReal-Time Systems. Real-Time Systems. Verification by testing. Verification by testing
EDA222/DIT161 Real-Tme Systems, Chalmers/GU, 2014/2015 Lecture #8 Real-Tme Systems Real-Tme Systems Lecture #8 Specfcaton Professor Jan Jonsson Implementaton System models Executon-tme analyss Department
More informationSubspace clustering. Clustering. Fundamental to all clustering techniques is the choice of distance measure between data points;
Subspace clusterng Clusterng Fundamental to all clusterng technques s the choce of dstance measure between data ponts; D q ( ) ( ) 2 x x = x x, j k = 1 k jk Squared Eucldean dstance Assumpton: All features
More informationLoad Balancing for Hex-Cell Interconnection Network
Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,
More informationCourse Introduction. Algorithm 8/31/2017. COSC 320 Advanced Data Structures and Algorithms. COSC 320 Advanced Data Structures and Algorithms
Course Introducton Course Topcs Exams, abs, Proects A quc loo at a few algorthms 1 Advanced Data Structures and Algorthms Descrpton: We are gong to dscuss algorthm complexty analyss, algorthm desgn technques
More informationReal-time Scheduling
Real-tme Schedulng COE718: Embedded System Desgn http://www.ee.ryerson.ca/~courses/coe718/ Dr. Gul N. Khan http://www.ee.ryerson.ca/~gnkhan Electrcal and Computer Engneerng Ryerson Unversty Overvew RTX
More informationCE 221 Data Structures and Algorithms
CE 1 ata Structures and Algorthms Chapter 4: Trees BST Text: Read Wess, 4.3 Izmr Unversty of Economcs 1 The Search Tree AT Bnary Search Trees An mportant applcaton of bnary trees s n searchng. Let us assume
More informationSmoothing Spline ANOVA for variable screening
Smoothng Splne ANOVA for varable screenng a useful tool for metamodels tranng and mult-objectve optmzaton L. Rcco, E. Rgon, A. Turco Outlne RSM Introducton Possble couplng Test case MOO MOO wth Game Theory
More informationSystem-on-Chip Design Analysis of Control Data Flow. Hao Zheng Comp Sci & Eng U of South Florida
System-on-Chp Desgn Analyss of Control Data Flow Hao Zheng Comp Sc & Eng U of South Florda Overvew DF models descrbe concurrent computa=on at a very hgh level Each actor descrbes non-trval computa=on.
More informationAssignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009.
Farrukh Jabeen Algorthms 51 Assgnment #2 Due Date: June 15, 29. Assgnment # 2 Chapter 3 Dscrete Fourer Transforms Implement the FFT for the DFT. Descrbed n sectons 3.1 and 3.2. Delverables: 1. Concse descrpton
More informationClassifier Selection Based on Data Complexity Measures *
Classfer Selecton Based on Data Complexty Measures * Edth Hernández-Reyes, J.A. Carrasco-Ochoa, and J.Fco. Martínez-Trndad Natonal Insttute for Astrophyscs, Optcs and Electroncs, Lus Enrque Erro No.1 Sta.
More informationMachine Learning: Algorithms and Applications
14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of
More informationA MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS
Proceedngs of the Wnter Smulaton Conference M E Kuhl, N M Steger, F B Armstrong, and J A Jones, eds A MOVING MESH APPROACH FOR SIMULATION BUDGET ALLOCATION ON CONTINUOUS DOMAINS Mark W Brantley Chun-Hung
More informationHierarchical clustering for gene expression data analysis
Herarchcal clusterng for gene expresson data analyss Gorgo Valentn e-mal: valentn@ds.unm.t Clusterng of Mcroarray Data. Clusterng of gene expresson profles (rows) => dscovery of co-regulated and functonally
More informationOutline. Digital Systems. C.2: Gates, Truth Tables and Logic Equations. Truth Tables. Logic Gates 9/8/2011
9/8/2 2 Outlne Appendx C: The Bascs of Logc Desgn TDT4255 Computer Desgn Case Study: TDT4255 Communcaton Module Lecture 2 Magnus Jahre 3 4 Dgtal Systems C.2: Gates, Truth Tables and Logc Equatons All sgnals
More informationBrave New World Pseudocode Reference
Brave New World Pseudocode Reference Pseudocode s a way to descrbe how to accomplsh tasks usng basc steps lke those a computer mght perform. In ths week s lab, you'll see how a form of pseudocode can be
More informationProgramming FPGAs in C/C++ with High Level Synthesis PACAP - HLS 1
Programmng FPGAs n C/C wth Hgh Level Synthess PACAP - HLS 1 Outlne Why Hgh Level Synthess? Challenges when syntheszng hardware from C/C Hgh Level Synthess from C n a nutshell Explorng performance/area
More informationK-means and Hierarchical Clustering
Note to other teachers and users of these sldes. Andrew would be delghted f you found ths source materal useful n gvng your own lectures. Feel free to use these sldes verbatm, or to modfy them to ft your
More informationEstimating Costs of Path Expression Evaluation in Distributed Object Databases
Estmatng Costs of Path Expresson Evaluaton n Dstrbuted Obect Databases Gabrela Ruberg, Fernanda Baão, and Marta Mattoso Department of Computer Scence COPPE/UFRJ P.O.Box 685, Ro de Janero, RJ, 2945-970
More information4/11/17. Agenda. Princeton University Computer Science 217: Introduction to Programming Systems. Goals of this Lecture. Storage Management.
//7 Prnceton Unversty Computer Scence 7: Introducton to Programmng Systems Goals of ths Lecture Storage Management Help you learn about: Localty and cachng Typcal storage herarchy Vrtual memory How the
More information11. APPROXIMATION ALGORITHMS
Copng wth NP-completeness 11. APPROXIMATION ALGORITHMS load balancng center selecton prcng method: vertex cover LP roundng: vertex cover generalzed load balancng knapsack problem Q. Suppose I need to solve
More informationLobachevsky State University of Nizhni Novgorod. Polyhedron. Quick Start Guide
Lobachevsky State Unversty of Nzhn Novgorod Polyhedron Quck Start Gude Nzhn Novgorod 2016 Contents Specfcaton of Polyhedron software... 3 Theoretcal background... 4 1. Interface of Polyhedron... 6 1.1.
More information7/12/2016. GROUP ANALYSIS Martin M. Monti UCLA Psychology AGGREGATING MULTIPLE SUBJECTS VARIANCE AT THE GROUP LEVEL
GROUP ANALYSIS Martn M. Mont UCLA Psychology NITP AGGREGATING MULTIPLE SUBJECTS When we conduct mult-subject analyss we are tryng to understand whether an effect s sgnfcant across a group of people. Whether
More informationAlgorithmic Transformation Techniques for Efficient Exploration of Alternative Application Instances
In: Proc. 0th Int. Symposum on Hardware/Software Codesgn (CODES 02), Estes Park, Colorado, USA, May 6 8, 2002 Algorthmc Transformaton Technques for Effcent Exploraton of Alternatve Applcaton Instances
More informationCS434a/541a: Pattern Recognition Prof. Olga Veksler. Lecture 15
CS434a/541a: Pattern Recognton Prof. Olga Veksler Lecture 15 Today New Topc: Unsupervsed Learnng Supervsed vs. unsupervsed learnng Unsupervsed learnng Net Tme: parametrc unsupervsed learnng Today: nonparametrc
More informationMeta-heuristics for Multidimensional Knapsack Problems
2012 4th Internatonal Conference on Computer Research and Development IPCSIT vol.39 (2012) (2012) IACSIT Press, Sngapore Meta-heurstcs for Multdmensonal Knapsack Problems Zhbao Man + Computer Scence Department,
More informationOptimizing for Speed. What is the potential gain? What can go Wrong? A Simple Example. Erik Hagersten Uppsala University, Sweden
Optmzng for Speed Er Hagersten Uppsala Unversty, Sweden eh@t.uu.se What s the potental gan? Latency dfference L$ and mem: ~5x Bandwdth dfference L$ and mem: ~x Repeated TLB msses adds a factor ~-3x Execute
More informationCluster Analysis of Electrical Behavior
Journal of Computer and Communcatons, 205, 3, 88-93 Publshed Onlne May 205 n ScRes. http://www.scrp.org/ournal/cc http://dx.do.org/0.4236/cc.205.350 Cluster Analyss of Electrcal Behavor Ln Lu Ln Lu, School
More informationContent Based Image Retrieval Using 2-D Discrete Wavelet with Texture Feature with Different Classifiers
IOSR Journal of Electroncs and Communcaton Engneerng (IOSR-JECE) e-issn: 78-834,p- ISSN: 78-8735.Volume 9, Issue, Ver. IV (Mar - Apr. 04), PP 0-07 Content Based Image Retreval Usng -D Dscrete Wavelet wth
More informationPerformance Study of Parallel Programming on Cloud Computing Environments Using MapReduce
Performance Study of Parallel Programmng on Cloud Computng Envronments Usng MapReduce Wen-Chung Shh, Shan-Shyong Tseng Department of Informaton Scence and Applcatons Asa Unversty Tachung, 41354, Tawan
More informationInternational Conference on Parallel Processing, St. Charles, IL, August COMMUNICATION OPTIMIZATIONS USED IN THE PARADIGM
Internatonal Conference on Parallel Processng, St. Charles, IL, August 199 1 COMMCATION OPTIMIZATIONS USED IN THE PARADIGM COMPILER FOR DISTRIBUTED-MEMORY MULTICOMPUTERS Danel J. Palermo, Ernesto Su, John
More informationAn Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices
Internatonal Mathematcal Forum, Vol 7, 2012, no 52, 2549-2554 An Applcaton of the Dulmage-Mendelsohn Decomposton to Sparse Null Space Bases of Full Row Rank Matrces Mostafa Khorramzadeh Department of Mathematcal
More informationOutline. Type of Machine Learning. Examples of Application. Unsupervised Learning
Outlne Artfcal Intellgence and ts applcatons Lecture 8 Unsupervsed Learnng Professor Danel Yeung danyeung@eee.org Dr. Patrck Chan patrckchan@eee.org South Chna Unversty of Technology, Chna Introducton
More informationRange images. Range image registration. Examples of sampling patterns. Range images and range surfaces
Range mages For many structured lght scanners, the range data forms a hghly regular pattern known as a range mage. he samplng pattern s determned by the specfc scanner. Range mage regstraton 1 Examples
More informationLecture 5: Multilayer Perceptrons
Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented
More informationConcurrent models of computation for embedded software
Concurrent models of computaton for embedded software and hardware! Researcher overvew what t looks lke semantcs what t means and how t relates desgnng an actor language actor propertes and how to represent
More information