Engineering Parallel Software with
|
|
- Helena Newman
- 6 years ago
- Views:
Transcription
1 Engineering Parallel Software with Our Pattern Language Profeor Kurt Keutzer and Tim Matton and (Jike Chong), Ekaterina Gonina, Bor-Yiing Su and Michael Anderon, Bryan Catanzaro, Chao-Yue Lai, Mark Murphy, David Sheffield, Naryanan Sundaram, 1/77 Main Point of Previou Lecture Many approache to parallelizing oftware are not working Profile and improve Swap in a new parallel programming language Rely on a uper parallelizing compiler My own experience ha hown that a ound oftware architecture i the greatet ingle indicator of a oftware project ucce. Software mut be architected to achieve productivity, efficiency, and correctne SW architecture >> programming environment >> programming language >> compiler and debugger (>>hardware architecture) If we had undertood how to architect equential oftware, then parallelizing oftware would not have been uch a challenge Key to architecture (oftware or otherwie) i deign pattern and a pattern language At the highet level our pattern language ha: Eight tructural pattern Thirteen computational pattern Ye, we really believe arbitrarily complex parallel oftware can built jut from thee! 2/77 Kurt Keutzer 1
2 Outline Our Pattern Language for parallel programming Detailed example uing Our Pattern Language 3/77 Alexander Pattern Language Chritopher Alexander approach to (civil) architecture: "Each pattern decribe a problem which occur over and over again in our environment, and then decribe the core of the olution to that problem, in uch a way that you can ue thi olution a million time over, without ever doing it the ame way twice. Page x, A Pattern Language, Chritopher Alexander Alexander 253 (civil) architectural pattern range from the creation of citie (2. ditribution of town) to particular building problem (232. roof cap) A pattern language i an organized way of tackling an architectural problem uing pattern Main limitation: It about civil not oftware architecture!!! 4 4/77 Kurt Keutzer 2
3 A Pattern Language Pattern embody generalizable olution to recurrent problem Collection of individual pattern (e.g. Deign Pattern, Gamma, Helm, Johnon, Vliide) are great, but we need more help in the oftware development enterprie We would like a comprehenive pattern language which cover the entire proce of parallel oftware development and implementation Keutzer, Matton, the PALLAS group, and other have developed jut uch a pattern language Pattern language i overviewed in: Today we don t have time to go through all the pattern, but we will briefly decribe the tructure of OPL and how how it can be ued 5/77 Our Pattern Language: At a High Level Application Structural Pattern Computational Pattern Implementation Pattern Execution Pattern 6/77 Kurt Keutzer 3
4 Architecting Parallel Software Application Identify the Software Structure Pipe-and-Filter Agent-and-Repoitory Event-baed Bulk Synchronou MapReduce Layered Sytem Model-view controller Arbitrary Tak Graph Puppeteer Model-view-controller Identify the Key Computation Graph Algorithm Dynamic programming Dene/Spare Linear Algebra Un/Structured Grid Graphical Model Finite State Machine Backtrack Branch-and-Bound N-Body Method Circuit Spectral Method Monte-Carlo 7/77 Our Pattern Language 2010: Detail Structural Pattern Pipe-and-Filter Agent-and-Repoitory Proce-Control Event-Baed/Implicit- Invocation Puppeteer Model-View-Controller Iterative-Refinement Map-Reduce Layered-Sytem Arbitrary-Static-Tak-Graph Computational Pattern Graph-Algorithm Dynamic-Programming Dene-Linear-Algebra Spare-Linear-Algebra Untructured-Grid Structured-Grid Graphical-Model Finite-State-Machine Backtrack-Branch-and- Bound N-Body-Method Circuit Spectral-Method Monte-Carlo Tak-Parallelim Divide and Conquer Data-Parallelim Dicrete-Event Geometric-Decompoition Implementation Strategy Pattern Shared-Queue SPMD Fork/Join Loop-Par. Ditributed-Array Shared-map Data-Par/index-pace Actor Tak-Queue Shared-Data Partitioned Graph Program tructure Data tructure Parallel Execution Pattern MIMD SIMD Thread-Pool Tak-Graph Tranaction 8/77 Kurt Keutzer 4
5 Our Pattern Language: At a High Level Application Structural Pattern Computational Pattern Implementation Pattern Execution Pattern 9/77 The Algorithm Strategy Deign Space Start Organize By Flow of Data Organize By Tak Organize By Data Regular? Irregular? Linear? Recurive? Linear? Recurive? Dicrete Event Tak Parallelim Divide and Conquer Geometric Decompoition??? Tak-Parallelim Data-Parallelim Divide and Conquer Dicrete-Event Geometric-Decompoition 10/77 Kurt Keutzer 5
6 Our Pattern Language: At a High Level Application Structural Pattern Computational Pattern Implementation Pattern Execution Pattern Implementation Strategy Pattern SPMD Fork/Join Data-Par/index-pace Actor Program tructure Loop-Par. Tak-Queue Shared-Queue Shared-map Partitioned Graph Ditributed-Array Shared-Data Data tructure 11/77 Our Pattern Language: At a High Level Application Structural Pattern Computational Pattern Implementation Pattern Execution Pattern Parallel Execution Pattern MIMD SIMD Thread-Pool Tak-Graph Tranaction 12/77 Kurt Keutzer 6
7 Outline Our Pattern Language for parallel programming Detailed example uing Our Pattern Language 13/77 Speech Recognition Application Software Architecture uing Pattern Identify Structural Pattern Identify Computational Pattern Parallelization: (for each module) Algorithm trategy pattern Implementation trategy pattern Execution pattern Concluion Outline 14/77 Kurt Keutzer 7
8 Automatic Speech Recognition Key technology for enabling rich human-computer interaction Increaingly important for intelligent device without keyboard Interaction require low latency repone Only one of many component in exciting new real-time application Main contribution: A detailed analyi of the concurrency in large vocabulary continuou peech recognition (LVCSR) ummarized in the pattern language An implementation on GPU with 11x peedup over optimized C++ verion Achieving 3x better than real time performance with 50,000 word vocabulary 15/77 Continuou Speech Recognition Challenge: Recognizing word from a large vocabulary arranged in exponentially many poible permutation Inferring word boundarie from the context of neighboring word Hidden Markov Model (HMM) i the mot ucceful approach 16/77 Kurt Keutzer 8
9 CSR Sytem Structure Inference engine baed ytem Ued in Sphinx (CMU, USA), HTK (Cambridge, UK), and Juliu (CSRC, Japan) Modular and flexible etup Shown to be effective for Arabic, Englih, Japanee, and Mandarin 17/77 Recognition Proce Conider a equence of feature one at a time, for each feature: Start with mot-likely-equence ending with previou feature Correponding to a et of active tate Conider the likelihood of current feature in it context Compute mot-likely-equence ending with current feature Correponding to the next et of active tate Recognition i a proce of graph traveral Recognition Network 18/77 Kurt Keutzer 9
10 Recognition Network Feature from one frame Gauian Mixture Model for One Phone State Mixture Component Computing ditance to each mixture component Computing weighted um of all component HMM Acoutic Phone Model aa hh n Pronunciation Model... HOP hh aa p... ON aa n... POP p aa p... Bigram Language Model T P CAT HAT HO IN ON PO P THE Compiled HMM Recognition Network... CAT... HAT HOP IN... ON POP... THE... 19/77 Graph Traveral Characteritic Compiled HMM Recognition Network Linear Lexical Network WFST Recognition Network Operation on a graph i driven by the graph tructure Irregular graph tructure challenging to tatically load balancing Irregular tate acce pattern challenging to dynamically load balancing Multiple tate may hare the ame next tate write contention in arc evaluation Traveral produce memory accee that pan the entire graph poor patial locality Inner loop have high data acce to computation ratio limited by memory bandwidth 20/77 Kurt Keutzer 10
11 Parallel Platform Characteritic Multicore/manycore deign philoophy Multicore: Devote ignificant tranitor reource to ingle thread performance Core Manycore: Maximizing computation throughput at the expene of ingle thread performance Architecture Trend: Increaing vector unit width Increaing number of core per die Core Core Cach Cach e e Cac ch Cach e e Core Core Application Implication: Mut increae data acce regularity Mut optimize ynchronization cot Core Cach e Cach e Core 21/77 Outline Speech Recognition Application Software Architecture uing Pattern Identify Structural Pattern Identify Computational Pattern Parallelization: (for each module) Algorithm trategy pattern Implementation trategy pattern Execution pattern Concluion 22/77 Kurt Keutzer 11
12 OPL 2010 Application Structural Pattern Pipe-and-Filter Agent-and-Repoitory Proce-Control Event-Baed/Implicit- Invocation Puppeteer Model-View-Controller Iterative-Refinement Map-Reduce Layered-Sytem Arbitrary-Static-Tak-Graph Computational Pattern Graph-Algorithm Dynamic-Programming Dene-Linear-Algebra Spare-Linear-Algebra Untructured-Grid Structured-Grid Graphical-Model Finite-State-Machine Backtrack-Branch-and- Bound N-Body-Method Circuit Spectral-Method Monte-Carlo Tak-Parallelim Divide and Conquer Data-Parallelim Dicrete-Event Geometric-Decompoition Implementation Strategy Pattern Shared-Queue SPMD Fork/Join Loop-Par. Ditributed-Array Shared-map Data-Par/index-pace Actor Tak-Queue Shared-Data Partitioned Graph Program tructure Data tructure Parallel Execution Pattern MIMD SIMD Thread-Pool Tak-Graph Tranaction Concurrency Foundation contruct (not expreed a pattern) Thread creation/detruction Proce creation/detruction Meage-Paing Collective-Comm. Tranactional memory Point-To-Point-Sync. (mutual excluion) collective ync. (barrier) Memory ync/fence 23 23/77 Architecting Parallel Software Decompoe Tak Group tak Order Tak Decompoe Data Data haring Data acce Identify the Software Identify the Key Structure Computation 24/77 Kurt Keutzer 12
13 Inference Engine Architecture Intro Recognition i a proce of graph traveral Each time-tep we need to identify the likely tate in the recognition network given the obervation acoutic ignal From a the of active tate we want to compute the next et of active tate uing probabilitie of acoutic ymbol and tate tranition What Structural pattern i thi? 25/77 Key computation: HMM Inference Algorithm An intance of: Graphical Model Implemented with: Dynamic Programming Find the mot-likely equence of tate that produced the obervation Viterbi Algorithm t State 1 State 2 State 3 State 4 Ob 1 Ob 2 Ob 3 Ob 4 x x x x Legend: A State x An Obervation P( x t t ) m [t-1][ t-1 ] P( t t-1 ) m [t][ t ] Markov Condition: J. Chong, Y. Yi, A. Faria, N.R. Satih and K. Keutzer, Data-Parallel Large Vocabulary Continuou Speech Recognition on Graphic Proceor, Emerging Application and Manycore Arch. 2008, pp , June /77 Kurt Keutzer 13
14 Iterative Refinement Structural Pattern One iteration per time tep Identify the et of probable tate in the network given acoutic ignal given current active tate et Prune unlikely tate Repeat 27/77 Digging Deeper Active Set Computation Architecture In each iteration we need to: Compute obervation probabilitie of tranition from current tate Travere the likely non-epilon arc to reach the et of next active tate t Travere the likely epilon arc to reach the et of next active tate What Structural pattern i thi? 28/77 Kurt Keutzer 14
15 Digging Deeper Active Set Computation Architecture Phae 1 Obervation probability computation Graphtraveral Phae 2 In each iteration we need to: Compute obervation probabilitie of tranition from current tate Travere the likely non-epilon arc to reach the et of next active tate Travere the likely epilon arc to reach the et of next active tate What Structural pattern i thi? 29/77 Phae 1: Obervation Probability computation Architecture Obervation probabilitie are computed from Gauian Mixture Model Each Gauian probability in each mixture i independent Probability for one phone tate i the um of all Gauian time the mixture probability for that tate What Structural pattern i thi? Dan Klein CS288, Lecture 9 30/77 Kurt Keutzer 15
16 Map-Reduce Structural Pattern Map each mixture probability computation Reduce the reult accumulate the total probability for that tate Gauian Mixture Model for One Phone State Mixture Component 31/77 Phae 2: Graph Traveral Now that we know the tranition probabilitie from current et of tate, we need to compute the next et of active tate (follow the likely tranition) Each tranition i independent Multiple tranition might end in the ame tate The end reult need to be a et of mot probable tate from all tranition What Structural pattern i thi? 32/77 Kurt Keutzer 16
17 Map-Reduce Structural Pattern- again Map each mixture probability computation Reduce the reult accumulate the total probability for that tate 33/77 High Level Structure of Engine Inference Engine Active Set Computation Inference Engine Active Set Computation Phae 1: Obervation Probability Computation Phae 2: Graph Traveral Phae 1 Phae 2 34/77 Kurt Keutzer 17
18 Outline Speech Recognition Application Software Architecture uing Pattern Identify Structural Pattern Identify Computational Pattern Parallelization: (for each module) Algorithm trategy pattern Implementation trategy pattern Execution pattern Concluion 35/77 What about Computation? Active Set Computation: Phae 1: Compute obervation probability of tranition given current et of tate Phae 2: Travere arc to determine next et of mot likely active tate What Computational Pattern are thee? Computational Pattern Graph-Algorithm Dynamic-Programming Dene-Linear-Algebra Spare-Linear-Algebra Untructured-Grid Structured-Grid Graphical-Model Finite-State-Machine Backtrack-Branch-and- Bound N-Body-Method Circuit Spectral-Method Monte-Carlo 36/77 Kurt Keutzer 18
19 Outline Speech Recognition Application Software Architecture uing Pattern Identify Structural Pattern Identify Computational Pattern Parallelization: (for each module) Algorithm trategy pattern Implementation trategy pattern Execution pattern Concluion 37/77 Now to Parallelim Inference Engine Structural Pattern: Iterative Refinement Computational Pattern: Dynamic Programming Inference engine and Active Set computation i equential Let look at Phae 1 and Phae 2 What Parallel Algorithm Strategy can we ue? Tak-Parallelim Data-Parallelim Divide and Conquer Dicrete-Event Geometric-Decompoition 38/77 Kurt Keutzer 19
20 Now to Parallelim Inference Engine Phae 1: Obervation Probability Computation Structural: MapReduce Computational: Graphical Model Compute cluter Gauian Mixture Probabilitie for each tranition label Hint: Look at data dependencie (or lack there-of) Tak-Parallelim Divide and Conquer Data-Parallelim Dicrete-Event Geometric-Decompoition 39/77 Now to Parallelim Inference Engine Phae 1: Obervation Probability Computation Structural: MapReduce Computational: Graphical Model Compute cluter Gauian Mixture Probabilitie for each tranition label Map reduce necearily implie dataparallelim Tak-Parallelim Data-Parallelim Divide and Conquer Dicrete-Event Geometric-Decompoition 40/77 Kurt Keutzer 20
21 Obervation Probability Pattern Computation Structural Pattern Pipe-and-Filter Agent-and-Repoitory Proce-Control Event-Baed/Implicit- Invocation Puppeteer Model-View-Controller Iterative-Refinement Map-Reduce Layered-Sytem Arbitrary-Static-Tak-Graph Computational Pattern Graph-Algorithm Dynamic-Programming Dene-Linear-Algebra Spare-Linear-Algebra Untructured-Grid Structured-Grid Graphical-Model Finite-State-Machine Backtrack-Branch-and- Bound N-Body-Method Circuit Spectral-Method Monte-Carlo Tak-Parallelim Divide and Conquer Data-Parallelim Dicrete-Event Geometric-Decompoition Implementation Strategy Pattern Shared-Queue SPMD Fork/Join Loop-Par. Ditributed-Array Shared-map Data-Par/index-pace Actor Tak-Queue Shared-Data Partitioned Graph Program tructure Data tructure Parallel Execution Pattern MIMD SIMD Thread-Pool Tak-Graph Tranaction Implementation trategy: Data tructure? 41/77 Structural Pattern Pipe-and-Filter Agent-and-Repoitory Proce-Control Event-Baed/Implicit- Invocation Puppeteer Obervation Probability Pattern Decompoition Model-View-Controller Iterative-Refinement Map-Reduce Layered-Sytem Arbitrary-Static-Tak-Graph Computational Pattern Graph-Algorithm Dynamic-Programming Dene-Linear-Algebra Spare-Linear-Algebra Untructured-Grid Structured-Grid Graphical-Model Finite-State-Machine Backtrack-Branch-and- Bound N-Body-Method Circuit Spectral-Method Monte-Carlo Tak-Parallelim Divide and Conquer Data-Parallelim Dicrete-Event Geometric-Decompoition Implementation Strategy Pattern Shared-Queue SPMD Fork/Join Loop-Par. Ditributed-Array Shared-map Data-Par/index-pace Actor Tak-Queue Shared-Data Partitioned Graph Program tructure Data tructure Parallel Execution Pattern MIMD SIMD Thread-Pool Tak-Graph Tranaction Gauian Mixture Model i hared among all computation read only 42/77 Kurt Keutzer 21
22 Structural Pattern Pipe-and-Filter Agent-and-Repoitory Proce-Control Event-Baed/Implicit- Invocation Puppeteer Obervation Probability Pattern Decompoition Model-View-Controller Iterative-Refinement Map-Reduce Layered-Sytem Arbitrary-Static-Tak-Graph Computational Pattern Graph-Algorithm Dynamic-Programming Dene-Linear-Algebra Spare-Linear-Algebra Untructured-Grid Structured-Grid Graphical-Model Finite-State-Machine Backtrack-Branch-and- Bound N-Body-Method Circuit Spectral-Method Monte-Carlo Tak-Parallelim Divide and Conquer Data-Parallelim Dicrete-Event Geometric-Decompoition Implementation Strategy Pattern Shared-Queue SPMD Fork/Join Loop-Par. Ditributed-Array Shared-map Data-Par/index-pace Actor Tak-Queue Shared-Data Partitioned Graph Program tructure Data tructure Parallel Execution Pattern MIMD SIMD Thread-Pool Tak-Graph Tranaction Program tructure? 43/77 Structural Pattern Pipe-and-Filter Agent-and-Repoitory Proce-Control Event-Baed/Implicit- Invocation Puppeteer Obervation Probability Pattern Decompoition Model-View-Controller Iterative-Refinement Map-Reduce Layered-Sytem Arbitrary-Static-Tak-Graph Computational Pattern Graph-Algorithm Dynamic-Programming Dene-Linear-Algebra Spare-Linear-Algebra Untructured-Grid Structured-Grid Graphical-Model Finite-State-Machine Backtrack-Branch-and- Bound N-Body-Method Circuit Spectral-Method Monte-Carlo Tak-Parallelim Divide and Conquer Data-Parallelim Dicrete-Event Geometric-Decompoition Implementation Strategy Pattern Shared-Queue SPMD Fork/Join Loop-Par. Ditributed-Array Shared-map Data-Par/index-pace Actor Tak-Queue Shared-Data Partitioned Graph Program tructure Data tructure Parallel Execution Pattern MIMD SIMD Thread-Pool Tak-Graph Tranaction Data parallel a ingle intruction tream i applied to multiple data element 44/77 Kurt Keutzer 22
23 Obervation Probability Pattern Decompoition Structural Pattern Pipe-and-Filter Agent-and-Repoitory Proce-Control Event-Baed/Implicit- Invocation Puppeteer Model-View-Controller Iterative-Refinement Map-Reduce Layered-Sytem Arbitrary-Static-Tak-Graph Computational Pattern Graph-Algorithm Dynamic-Programming Dene-Linear-Algebra Spare-Linear-Algebra Untructured-Grid Structured-Grid Graphical-Model Finite-State-Machine Backtrack-Branch-and- Bound N-Body-Method Circuit Spectral-Method Monte-Carlo Tak-Parallelim Divide and Conquer Data-Parallelim Dicrete-Event Geometric-Decompoition Implementation Strategy Pattern Shared-Queue SPMD Fork/Join Loop-Par. Ditributed-Array Shared-map Data-Par/index-pace Actor Tak-Queue Shared-Data Partitioned Graph Program tructure Data tructure Parallel Execution Pattern MIMD SIMD Thread-Pool Tak-Graph Tranaction Execution? a ingle intruction tream i applied to multiple data element 45/77 Obervation Probability Pattern Decompoition Structural Pattern Pipe-and-Filter Agent-and-Repoitory Proce-Control Event-Baed/Implicit- Invocation Puppeteer Model-View-Controller Iterative-Refinement Map-Reduce Layered-Sytem Arbitrary-Static-Tak-Graph Computational Pattern Graph-Algorithm Dynamic-Programming Dene-Linear-Algebra Spare-Linear-Algebra Untructured-Grid Structured-Grid Graphical-Model Finite-State-Machine Backtrack-Branch-and- Bound N-Body-Method Circuit Spectral-Method Monte-Carlo Tak-Parallelim Divide and Conquer Data-Parallelim Dicrete-Event Geometric-Decompoition Implementation Strategy Pattern Shared-Queue SPMD Fork/Join Loop-Par. Ditributed-Array Shared-map Data-Par/index-pace Actor Tak-Queue Shared-Data Partitioned Graph Program tructure Data tructure Parallel Execution Pattern MIMD SIMD Thread-Pool Tak-Graph Tranaction SIMD 46/77 Kurt Keutzer 23
24 Now to Parallelim Pt2 Inference Engine Phae 2: Graph Traveral Structural: MapReduce Computational: Graph algorithm/graph traveral The recognition network i a finite tate tranducer, repreented a a weighted and labeled graph Decoding on thi graph i Breadth- Firt Traveral What Parallel Algorithmic Strategy can we ue? Hint: What are the operand? What are the dependence? Tak-Parallelim Divide and Conquer Data-Parallelim Dicrete-Event Geometric-Decompoition 47/77 Now to Parallelim Inference Engine Phae 2: Graph Traveral Structural: MapReduce Computational: Graph Traveral The recognition network i a finite tate tranducer, repreented a a weighted and labeled graph Decoding on thi graph i Breadth- Firt Traveral Data Parallelim! Each next tate computation can be computed independently. Tak-Parallelim Divide and Conquer Data-Parallelim Dicrete-Event Geometric-Decompoition 48/77 Kurt Keutzer 24
25 Active Set Computation Structural Pattern Pipe-and-Filter Agent-and-Repoitory Proce-Control Event-Baed/Implicit- Invocation Puppeteer Model-View-Controller Iterative-Refinement Map-Reduce Layered-Sytem Arbitrary-Static-Tak-Graph Computational Pattern Graph-Algorithm Dynamic-Programming Dene-Linear-Algebra Spare-Linear-Algebra Untructured-Grid Structured-Grid Graphical-Model Finite-State-Machine Backtrack-Branch-and- Bound N-Body-Method Circuit Spectral-Method Monte-Carlo Tak-Parallelim Divide and Conquer Data-Parallelim Dicrete-Event Geometric-Decompoition Implementation Strategy Pattern Shared-Queue SPMD Fork/Join Loop-Par. Ditributed-Array Shared-map Data-Par/index-pace Actor Tak-Queue Shared-Data Partitioned Graph Program tructure Data tructure Parallel Execution Pattern MIMD SIMD Thread-Pool Tak-Graph Tranaction SIMD 49/77 Shared Queue Implementation In each iteration: Arc tranition from the current et of tate are travered to contruct a et of next active tate Each thread enqueue one or more next active tate to a global queue of next active tate Thi approach caue ignificant contention a the queue head pointer In our application - thouand of thread are acceing one memory location Caue erialization of thread memory accee Limit calability 50/77 Kurt Keutzer 25
26 Shared Queue Implementation Solution to the queue head pointer contention -> ditributed queue 1. Each thread in a thread block write to a local queue of next active tate 2. Local queue are merged into a global queue Contention on global queue pointer i reduced from #thread to #block Significantly improve calability 51/77 Speech Reference Implementation Jike Chong, Ekaterina Gonina, Youngmin Yi, Kurt Keutzer, A Fully Data Parallel WFST-baed Large Vocabulary Continuou Speech Recognition on a Graphic Proceing Unit, Proceeding of the 10th Annual Conference of the International Speech Communication Aociation (InterSpeech), page , September, W R W R W R Manycore GPU Data W R R W R Control Phae 0 Iteration Control Prepare ActiveSet Phae 1 Compute Obervation Probability Phae 2 For each active arc: Compute arc tranition probability Copy reult back to CPU Control Read File Initialize data tructure Collect Backtrack Info Backtrack Output Reult CPU Data W R W R Kiun You, Jike Chong, Youngmin Yi, Ekaterina Gonina, Chritopher Hughe, Yen-Kuang Chen, Wonyong Sung, Kurt Keutzer, Parallel Scalability in Speech Recognition: Inference engine in large vocabulary continuou peech recognition, IEEE Signal Proceing Magazine, vol. 26, no. 6, pp , November Jike Chong, Ekaterina Gonina, Kiun You, Kurt Keutzer, Scalable Parallelization of Automatic Speech Recognition, Invited book chapter in Scaling Up Machine Learning, an upcoming 2010 Cambridge Univerity Pre book. 52/77 Kurt Keutzer 26
27 Summary A good architect need to undertand: Structural pattern Computational pattern Refinement through Our Pattern Language Graph algorithm and graphical model are critical to many application Graph algorithm are epecially difficult to parallelize and library upport i inadequate There will be at leat a decade of hand-crafted olution We achieved good reult on parallelizing large-vocabulary automatic peech recognition 53/77 Extra 54/77 Kurt Keutzer 27
28 Example: Breadth Firt Search Breadth firt earch Ditributed Graph Ditributed Queue Ditributed Viitor Ditributed Property Map Algorithm Viitor 55/77 Example: Ditributed BFS Ditributed Graph (a) Ditributed Graph (b) Ditributed adjacency lit repreentation 56/77 Kurt Keutzer 28
29 Example: Ditributed BFS Ditributed Queue pop() from local queue puh() end meage to the vertex owner empty() exhaut local queue and ynchronize with other proceor to determine termination condition Meage are only received after all proceor have completed operation at one level (b) Ditributed adjacency lit repreentation 57 57/77 Example: Ditributed BFS Ditributed Viitor Owner-compute cheme, o ditributed verion i the ame a equential viitor Ditributed Property Map Local Property Map Store local propertie for local vertice and edge Ghot Cell Store propertie for ghot cell Proce Group Communication medium Data Race Reolver Decide among variou put() meage ent to the Ditributed Property Map (a) Ditributed Graph (b) Ditributed adjacency lit repreentation 58 58/77 Kurt Keutzer 29
30 Performance achieved 128 node, 2GHz AMD Opteron, 4GB RAM each, connected over Infiniband, one core active per node BFS: 100k vertice and ~15M edge (C) Crauer et al (E) Eager heuritic BFS: 1M vertice and ~15M edge Dijktra Algorithm: 100k vertice and ~15M edge 59 59/77 Dicuion Graph traveral algorithm characteritic: 1. Input data-driven computation 2. Untructured problem 3. Poor data locality 4. High data acce to computation ratio High demand for low memory latency and poor data locality make it challenging for PE without fine-grain multiproceing Pointer chaing mainly involve integer operation Niagara type platform eem uitable for thi application domain What about GPGPU? What how well would our cheduling algorithm map to Parallel BGL? 60 60/77 Kurt Keutzer 30
Third party names are the property of their owners.
Disclaimer READ THIS its very important The views expressed in this talk are those of the speaker and not his employer. This is an academic style talk and does not address details of any particular Intel
More informationLecture 14: Minimum Spanning Tree I
COMPSCI 0: Deign and Analyi of Algorithm October 4, 07 Lecture 4: Minimum Spanning Tree I Lecturer: Rong Ge Scribe: Fred Zhang Overview Thi lecture we finih our dicuion of the hortet path problem and introduce
More informationGPU-Accelerated Large Vocabulary Continuous Speech Recognition for Scalable Distributed Speech Recognition. Jungsuk Kim Ian Lane
GPU-Accelerated Large Vocabulary Continuou Speech Recognition for Scalable Ditributed Speech Recognition Junguk Kim Ian Lane Electrical and Computer Engineering Carnegie Mellon Univerity March 20, 2015
More informationThe Association of System Performance Professionals
The Aociation of Sytem Performance Profeional The Computer Meaurement Group, commonly called CMG, i a not for profit, worldwide organization of data proceing profeional committed to the meaurement and
More informationContents. shortest paths. Notation. Shortest path problem. Applications. Algorithms and Networks 2010/2011. In the entire course:
Content Shortet path Algorithm and Network 21/211 The hortet path problem: Statement Verion Application Algorithm (for ingle ource p problem) Reminder: relaxation, Dijktra, Variant of Dijktra, Bellman-Ford,
More informationGeneric Traverse. CS 362, Lecture 19. DFS and BFS. Today s Outline
Generic Travere CS 62, Lecture 9 Jared Saia Univerity of New Mexico Travere(){ put (nil,) in bag; while (the bag i not empty){ take ome edge (p,v) from the bag if (v i unmarked) mark v; parent(v) = p;
More information3D SMAP Algorithm. April 11, 2012
3D SMAP Algorithm April 11, 2012 Baed on the original SMAP paper [1]. Thi report extend the tructure of MSRF into 3D. The prior ditribution i modified to atify the MRF property. In addition, an iterative
More informationBuilding a Compact On-line MRF Recognizer for Large Character Set using Structured Dictionary Representation and Vector Quantization Technique
202 International Conference on Frontier in Handwriting Recognition Building a Compact On-line MRF Recognizer for Large Character Set uing Structured Dictionary Repreentation and Vector Quantization Technique
More informationToday s Outline. CS 561, Lecture 23. Negative Weights. Shortest Paths Problem. The presence of a negative cycle might mean that there is
Today Outline CS 56, Lecture Jared Saia Univerity of New Mexico The path that can be trodden i not the enduring and unchanging Path. The name that can be named i not the enduring and unchanging Name. -
More informationShortest Paths Problem. CS 362, Lecture 20. Today s Outline. Negative Weights
Shortet Path Problem CS 6, Lecture Jared Saia Univerity of New Mexico Another intereting problem for graph i that of finding hortet path Aume we are given a weighted directed graph G = (V, E) with two
More information1 The secretary problem
Thi i new material: if you ee error, pleae email jtyu at tanford dot edu 1 The ecretary problem We will tart by analyzing the expected runtime of an algorithm, a you will be expected to do on your homework.
More informationThe Data Locality of Work Stealing
The Data Locality of Work Stealing Umut A. Acar School of Computer Science Carnegie Mellon Univerity umut@c.cmu.edu Guy E. Blelloch School of Computer Science Carnegie Mellon Univerity guyb@c.cmu.edu Robert
More informationAdvanced Encryption Standard and Modes of Operation
Advanced Encryption Standard and Mode of Operation G. Bertoni L. Breveglieri Foundation of Cryptography - AES pp. 1 / 50 AES Advanced Encryption Standard (AES) i a ymmetric cryptographic algorithm AES
More informationStochastic Search and Graph Techniques for MCM Path Planning Christine D. Piatko, Christopher P. Diehl, Paul McNamee, Cheryl Resch and I-Jeng Wang
Stochatic Search and Graph Technique for MCM Path Planning Chritine D. Piatko, Chritopher P. Diehl, Paul McNamee, Cheryl Rech and I-Jeng Wang The John Hopkin Univerity Applied Phyic Laboratory, Laurel,
More informationKeywords Cloud Computing, Service Level Agreements (SLA), CloudSim, Monitoring & Controlling SLA Agent, JADE
Volume 5, Iue 8, Augut 2015 ISSN: 2277 128X International Journal of Advanced Reearch in Computer Science and Software Engineering Reearch Paper Available online at: www.ijarce.com Verification of Agent
More informationOperational Semantics Class notes for a lecture given by Mooly Sagiv Tel Aviv University 24/5/2007 By Roy Ganor and Uri Juhasz
Operational emantic Page Operational emantic Cla note for a lecture given by Mooly agiv Tel Aviv Univerity 4/5/7 By Roy Ganor and Uri Juhaz Reference emantic with Application, H. Nielon and F. Nielon,
More informationLaboratory Exercise 6
Laboratory Exercie 6 Adder, Subtractor, and Multiplier a a The purpoe of thi exercie i to examine arithmetic circuit that add, ubtract, and multiply number. Each b c circuit will be decribed in Verilog
More informationAUTOMATIC TEST CASE GENERATION USING UML MODELS
Volume-2, Iue-6, June-2014 AUTOMATIC TEST CASE GENERATION USING UML MODELS 1 SAGARKUMAR P. JAIN, 2 KHUSHBOO S. LALWANI, 3 NIKITA K. MAHAJAN, 4 BHAGYASHREE J. GADEKAR 1,2,3,4 Department of Computer Engineering,
More informationHow to. write a paper. The basics writing a solid paper Different communities/different standards Common errors
How to write a paper The baic writing a olid paper Different communitie/different tandard Common error Reource Raibert eay My grammar point Article on a v. the Bug in writing Clarity Goal Conciene Calling
More informationIncreasing Throughput and Reducing Delay in Wireless Sensor Networks Using Interference Alignment
Int. J. Communication, Network and Sytem Science, 0, 5, 90-97 http://dx.doi.org/0.436/ijcn.0.50 Publihed Online February 0 (http://www.scirp.org/journal/ijcn) Increaing Throughput and Reducing Delay in
More informationLecture Outline. Global flow analysis. Global Optimization. Global constant propagation. Liveness analysis. Local Optimization. Global Optimization
Lecture Outline Global flow analyi Global Optimization Global contant propagation Livene analyi Adapted from Lecture by Prof. Alex Aiken and George Necula (UCB) CS781(Praad) L27OP 1 CS781(Praad) L27OP
More informationFactor Graphs and Inference
Factor Graph and Inerence Sargur Srihari rihari@cedar.bualo.edu 1 Topic 1. Factor Graph 1. Factor in probability ditribution. Deriving them rom graphical model. Eact Inerence Algorithm or Tree graph 1.
More informationLecture 8: More Pipelining
Overview Lecture 8: More Pipelining David Black-Schaffer davidbb@tanford.edu EE8 Spring 00 Getting Started with Lab Jut get a ingle pixel calculating at one time Then look into filling your pipeline Multiplier
More informationDistributed Packet Processing Architecture with Reconfigurable Hardware Accelerators for 100Gbps Forwarding Performance on Virtualized Edge Router
Ditributed Packet Proceing Architecture with Reconfigurable Hardware Accelerator for 100Gbp Forwarding Performance on Virtualized Edge Router Satohi Nihiyama, Hitohi Kaneko, and Ichiro Kudo Abtract To
More informationMarkov Random Fields in Image Segmentation
Preented at SSIP 2011, Szeged, Hungary Markov Random Field in Image Segmentation Zoltan Kato Image Proceing & Computer Graphic Dept. Univerity of Szeged Hungary Zoltan Kato: Markov Random Field in Image
More informationPerformance of a Robust Filter-based Approach for Contour Detection in Wireless Sensor Networks
Performance of a Robut Filter-baed Approach for Contour Detection in Wirele Senor Network Hadi Alati, William A. Armtrong, Jr., and Ai Naipuri Department of Electrical and Computer Engineering The Univerity
More informationAnalyzing Hydra Historical Statistics Part 2
Analyzing Hydra Hitorical Statitic Part Fabio Maimo Ottaviani EPV Technologie White paper 5 hnode HSM Hitorical Record The hnode i the hierarchical data torage management node and ha to perform all the
More informationUniversität Augsburg. Institut für Informatik. Approximating Optimal Visual Sensor Placement. E. Hörster, R. Lienhart.
Univerität Augburg à ÊÇÅÍÆ ËÀǼ Approximating Optimal Viual Senor Placement E. Hörter, R. Lienhart Report 2006-01 Januar 2006 Intitut für Informatik D-86135 Augburg Copyright c E. Hörter, R. Lienhart Intitut
More informationExercise 4: Markov Processes, Cellular Automata and Fuzzy Logic
Exercie 4: Marko rocee, Cellular Automata and Fuzzy Logic Formal Method II, Fall Semeter 203 Solution Sheet Marko rocee Theoretical Exercie. (a) ( point) 0.2 0.7 0.3 tanding 0.25 lying 0.5 0.4 0.2 0.05
More informationCS201: Data Structures and Algorithms. Assignment 2. Version 1d
CS201: Data Structure and Algorithm Aignment 2 Introduction Verion 1d You will compare the performance of green binary earch tree veru red-black tree by reading in a corpu of text, toring the word and
More informationLinkGuide: Towards a Better Collection of Hyperlinks in a Website Homepage
Proceeding of the World Congre on Engineering 2007 Vol I LinkGuide: Toward a Better Collection of Hyperlink in a Webite Homepage A. Ammari and V. Zharkova chool of Informatic, Univerity of Bradford anammari@bradford.ac.uk,
More informationRouting Definition 4.1
4 Routing So far, we have only looked at network without dealing with the iue of how to end information in them from one node to another The problem of ending information in a network i known a routing
More informationDigifort Standard. Architecture
Digifort Standard Intermediate olution for intalling up to 32 camera The Standard verion provide the ideal reource for local and remote monitoring of up to 32 camera per erver and a the intermediate verion
More informationES205 Analysis and Design of Engineering Systems: Lab 1: An Introductory Tutorial: Getting Started with SIMULINK
ES05 Analyi and Deign of Engineering Sytem: Lab : An Introductory Tutorial: Getting Started with SIMULINK What i SIMULINK? SIMULINK i a oftware package for modeling, imulating, and analyzing dynamic ytem.
More informationCourse Project: Adders, Subtractors, and Multipliers a
In the name Allah Department of Computer Engineering 215 Spring emeter Computer Architecture Coure Intructor: Dr. Mahdi Abbai Coure Project: Adder, Subtractor, and Multiplier a a The purpoe of thi p roject
More informationA New Approach to Pipeline FFT Processor
A ew Approach to Pipeline FFT Proceor Shouheng He and Mat Torkelon Department of Applied Electronic, Lund Univerity S- Lund, SWEDE email: he@tde.lth.e; torkel@tde.lth.e Abtract A new VLSI architecture
More informationKey Terms - MinMin, MaxMin, Sufferage, Task Scheduling, Standard Deviation, Load Balancing.
Volume 3, Iue 11, November 2013 ISSN: 2277 128X International Journal of Advanced Reearch in Computer Science and Software Engineering Reearch Paper Available online at: www.ijarce.com Tak Aignment in
More informationA PROBABILISTIC NOTION OF CAMERA GEOMETRY: CALIBRATED VS. UNCALIBRATED
A PROBABILISTIC NOTION OF CAMERA GEOMETRY: CALIBRATED VS. UNCALIBRATED Jutin Domke and Yianni Aloimono Computational Viion Laboratory, Center for Automation Reearch Univerity of Maryland College Park,
More informationDAROS: Distributed User-Server Assignment And Replication For Online Social Networking Applications
DAROS: Ditributed Uer-Server Aignment And Replication For Online Social Networking Application Thuan Duong-Ba School of EECS Oregon State Univerity Corvalli, OR 97330, USA Email: duongba@eec.oregontate.edu
More informationSet-based Approach for Lossless Graph Summarization using Locality Sensitive Hashing
Set-baed Approach for Lole Graph Summarization uing Locality Senitive Hahing Kifayat Ullah Khan Supervior: Young-Koo Lee Expected Graduation Date: Fall 0 Deptartment of Computer Engineering Kyung Hee Univerity
More information999 Computer System Network. (12) Patent Application Publication (10) Pub. No.: US 2006/ A1. (19) United States
(19) United State US 2006O1296.60A1 (12) Patent Application Publication (10) Pub. No.: Mueller et al. (43) Pub. Date: Jun. 15, 2006 (54) METHOD AND COMPUTER SYSTEM FOR QUEUE PROCESSING (76) Inventor: Wolfgang
More informationLaboratory Exercise 6
Laboratory Exercie 6 Adder, Subtractor, and Multiplier The purpoe of thi exercie i to examine arithmetic circuit that add, ubtract, and multiply number. Each type of circuit will be implemented in two
More informationLaboratory Exercise 6
Laboratory Exercie 6 Adder, Subtractor, and Multiplier The purpoe of thi exercie i to examine arithmetic circuit that add, ubtract, and multiply number. Each circuit will be decribed in Verilog and implemented
More informationRefining SIRAP with a Dedicated Resource Ceiling for Self-Blocking
Refining SIRAP with a Dedicated Reource Ceiling for Self-Blocking Mori Behnam, Thoma Nolte Mälardalen Real-Time Reearch Centre P.O. Box 883, SE-721 23 Väterå, Sweden {mori.behnam,thoma.nolte}@mdh.e ABSTRACT
More informationA Map Reduce Framework for Programming Graphics Processors
A Map Reduce Framework for Programming Graphic Proceor Bryan Catanzaro, Narayanan Sundaram and Kurt Keutzer Univerity of California, Berkeley 545Q Cory Hall Berkeley, California 94720 {catanzar, narayan,
More informationYOUNGMIN YI. B.S. in Computer Engineering, 2000 Seoul National University (SNU), Seoul, Korea
YOUNGMIN YI Parallel Computing Lab Phone: +1 (925) 348-1095 573 Soda Hall Email: ymyi@eecs.berkeley.edu Electrical Engineering and Computer Science Web: http://eecs.berkeley.edu/~ymyi University of California,
More informationA CLUSTERING-BASED HYBRID REPLICA CONTROL PROTOCOL FOR HIGH AVAILABILITY IN GRID ENVIRONMENT
Journal of Computer Science 10 (12): 2442-2449, 2014 ISSN: 1549-3636 2014 R. Latip et al., Thi open acce article i ditributed under a Creative Common Attribution (CC-BY) 3.0 licene doi:10.3844/jcp.2014.2442.2449
More information[N309] Feedforward Active Noise Control Systems with Online Secondary Path Modeling. Muhammad Tahir Akhtar, Masahide Abe, and Masayuki Kawamata
he 32nd International Congre and Expoition on Noie Control Engineering Jeju International Convention Center, Seogwipo, Korea, Augut 25-28, 2003 [N309] Feedforward Active Noie Control Sytem with Online
More informationA Sparse Shared-Memory Multifrontal Solver in SCAD Software
Proceeding of the International Multiconference on ISBN 978-83-6080--9 Computer Science and Information echnology, pp. 77 83 ISSN 896-709 A Spare Shared-Memory Multifrontal Solver in SCAD Software Sergiy
More informationBottom Up parsing. Bottom-up parsing. Steps in a shift-reduce parse. 1. s. 2. np. john. john. john. walks. walks.
Paring Technologie Outline Paring Technologie Outline Bottom Up paring Paring Technologie Paring Technologie Bottom-up paring Step in a hift-reduce pare top-down: try to grow a tree down from a category
More informationArchitecture and grid application of cluster computing system
Architecture and grid application of cluter computing ytem Yi Lv*, Shuiqin Yu, Youju Mao Intitute of Optical Fiber Telecom, Chongqing Univ. of Pot & Telecom, Chongqing, 400065, P.R.China ABSTRACT Recently,
More informationParallel Approaches for Intervals Analysis of Variable Statistics in Large and Sparse Linear Equations with RHS Ranges
American Journal of Applied Science 4 (5): 300-306, 2007 ISSN 1546-9239 2007 Science Publication Correponding Author: Parallel Approache for Interval Analyi of Variable Statitic in Large and Spare Linear
More informationTopics. Lecture 37: Global Optimization. Issues. A Simple Example: Copy Propagation X := 3 B > 0 Y := 0 X := 4 Y := Z + W A := 2 * 3X
Lecture 37: Global Optimization [Adapted from note by R. Bodik and G. Necula] Topic Global optimization refer to program optimization that encompa multiple baic block in a function. (I have ued the term
More informationDelaunay Triangulation: Incremental Construction
Chapter 6 Delaunay Triangulation: Incremental Contruction In the lat lecture, we have learned about the Lawon ip algorithm that compute a Delaunay triangulation of a given n-point et P R 2 with O(n 2 )
More informationxy-monotone path existence queries in a rectilinear environment
CCCG 2012, Charlottetown, P.E.I., Augut 8 10, 2012 xy-monotone path exitence querie in a rectilinear environment Gregory Bint Anil Mahehwari Michiel Smid Abtract Given a planar environment coniting of
More informationCompiler Construction
Compiler Contruction Lecture 6 - An Introduction to Bottom- Up Paring 3 Robert M. Siegfried All right reerved Bottom-up Paring Bottom-up parer pare a program from the leave of a pare tree, collecting the
More informationA Basic Prototype for Enterprise Level Project Management
A Baic Prototype for Enterprie Level Project Management Saurabh Malgaonkar, Abhay Kolhe Computer Engineering Department, Mukeh Patel School of Technology Management & Engineering, NMIMS Univerity, Mumbai,
More informationIMPROVED JPEG DECOMPRESSION OF DOCUMENT IMAGES BASED ON IMAGE SEGMENTATION. Tak-Shing Wong, Charles A. Bouman, and Ilya Pollak
IMPROVED DECOMPRESSION OF DOCUMENT IMAGES BASED ON IMAGE SEGMENTATION Tak-Shing Wong, Charle A. Bouman, and Ilya Pollak School of Electrical and Computer Engineering Purdue Univerity ABSTRACT We propoe
More informationProgrammazione di sistemi multicore
Programmazione di itemi multicore A.A. 2015-2016 LECTURE 9 IRENE FINOCCHI http://wwwuer.di.uniroma1.it/~inocchi/ More complex parallel pattern PARALLEL PREFIX PARALLEL SORTING PARALLEL MERGESORT PARALLEL
More informationTrainable Context Model for Multiscale Segmentation
Trainable Context Model for Multicale Segmentation Hui Cheng and Charle A. Bouman School of Electrical and Computer Engineering Purdue Univerity Wet Lafayette, IN 47907-1285 {hui, bouman}@ ecn.purdue.edu
More informationVLSI Design 9. Datapath Design
VLSI Deign 9. Datapath Deign 9. Datapath Deign Lat module: Adder circuit Simple adder Fat addition Thi module omparator Shifter Multi-input Adder Multiplier omparator detector: A = 1 detector: A = 11 111
More informationSIMIT 7. Component Type Editor (CTE) User manual. Siemens Industrial
SIMIT 7 Component Type Editor (CTE) Uer manual Siemen Indutrial Edition January 2013 Siemen offer imulation oftware to plan, imulate and optimize plant and machine. The imulation- and optimizationreult
More informationAspects of Formal and Graphical Design of a Bus System
Apect of Formal and Graphical Deign of a Bu Sytem Tiberiu Seceleanu Univerity of Turku, Dpt. of Information Technology Turku, Finland tiberiu.eceleanu@utu.fi Tomi Weterlund Turku Centre for Computer Science
More informationAudio-Visual Voice Command Recognition in Noisy Conditions
Audio-Viual Voice Command Recognition in Noiy Condition Joef Chaloupka, Jan Nouza, Jindrich Zdanky Laboratory of Computer Speech Proceing, Intitute of Information Technology and Electronic, Technical Univerity
More informationUninformed Search Complexity. Informed Search. Search Revisited. Day 2/3 of Search
Informed Search ay 2/3 of Search hap. 4, Ruel & Norvig FS IFS US PFS MEM FS IS Uninformed Search omplexity N = Total number of tate = verage number of ucceor (branching factor) L = Length for tart to goal
More informationMulticast with Network Coding in Application-Layer Overlay Networks
IEEE JOURNAL ON SELECTED AREAS IN COMMUNICATIONS, VOL. 22, NO. 1, JANUARY 2004 1 Multicat with Network Coding in Application-Layer Overlay Network Ying Zhu, Baochun Li, Member, IEEE, and Jiang Guo Abtract
More informationOn the Efficacy of a Fused CPU+GPU Processor (or APU) for Parallel Computing
On the Efficacy of a Fued CPU+GPU Proceor (or APU) for Parallel Computing Mayank Daga, Ahwin M. Aji, and Wu-chun Feng Dept. of Computer Science Sampling of field that ue GPU Mac OS X Comology Molecular
More informationmapping reult. Our experiment have revealed that for many popular tream application, uch a networking and multimedia application, the number of VC nee
Reolving Deadlock for Pipelined Stream Application on Network-on-Chip Xiaohang Wang 1,2, Peng Liu 1 1 Department of Information Science and Electronic Engineering, Zheiang Univerity Hangzhou, Zheiang,
More informationLaboratory Exercise 6
Laboratory Exercie 6 Adder, Subtractor, and Multiplier The purpoe of thi exercie i to examine arithmetic circuit that add, ubtract, and multiply number. Each circuit will be decribed in VHL and implemented
More informationChapter S:II (continued)
Chapter S:II (continued) II. Baic Search Algorithm Sytematic Search Graph Theory Baic State Space Search Depth-Firt Search Backtracking Breadth-Firt Search Uniform-Cot Search AND-OR Graph Baic Depth-Firt
More information(12) Patent Application Publication (10) Pub. No.: US 2011/ A1
(19) United State US 2011 0316690A1 (12) Patent Application Publication (10) Pub. No.: US 2011/0316690 A1 Siegman (43) Pub. Date: Dec. 29, 2011 (54) SYSTEMAND METHOD FOR IDENTIFYING ELECTRICAL EQUIPMENT
More informationOn successive packing approach to multidimensional (M-D) interleaving
On ucceive packing approach to multidimenional (M-D) interleaving Xi Min Zhang Yun Q. hi ankar Bau Abtract We propoe an interleaving cheme for multidimenional (M-D) interleaving. To achieved by uing a
More informationCutting Stock by Iterated Matching. Andreas Fritsch, Oliver Vornberger. University of Osnabruck. D Osnabruck.
Cutting Stock by Iterated Matching Andrea Fritch, Oliver Vornberger Univerity of Onabruck Dept of Math/Computer Science D-4909 Onabruck andy@informatikuni-onabrueckde Abtract The combinatorial optimization
More informationSource Code (C) Phantom Support Sytem Generic Front-End Compiler BB CFG Code Generation ANSI C Single-threaded Application Phantom Call Identifier AEB
Code Partitioning for Synthei of Embedded Application with Phantom André C. Nácul, Tony Givargi Department of Computer Science Univerity of California, Irvine {nacul, givargi@ic.uci.edu ABSTRACT In a large
More informationMinimum congestion spanning trees in bipartite and random graphs
Minimum congetion panning tree in bipartite and random graph M.I. Otrovkii Department of Mathematic and Computer Science St. John Univerity 8000 Utopia Parkway Queen, NY 11439, USA e-mail: otrovm@tjohn.edu
More informationManeuverable Relays to Improve Energy Efficiency in Sensor Networks
Maneuverable Relay to Improve Energy Efficiency in Senor Network Stephan Eidenbenz, Luka Kroc, Jame P. Smith CCS-5, MS M997; Lo Alamo National Laboratory; Lo Alamo, NM 87545. Email: {eidenben, kroc, jpmith}@lanl.gov
More informationAN ALGORITHM FOR MINIMALLY LATENT GLOBAL VIRTUAL TIME. equal to the timestamp of the last event executed.
AN ALGORITHM FOR MINIMALLY LATENT GLOBAL VIRTUAL TIME Alexander I. Tomlinon Vijay K. Garg Department of Electrical and Computer Engineering The Univerity of Texa at Autin, Autin, Texa 78712 Thi paper appear
More informationNew Structural Decomposition Techniques for Constraint Satisfaction Problems
New Structural Decompoition Technique for Contraint Satifaction Problem Yaling Zheng and Berthe Y. Choueiry Contraint Sytem Laboratory Univerity of Nebraka-Lincoln Email: yzheng choueiry@ce.unl.edu Abtract.
More informationDomain-Specific Modeling for Rapid System-Wide Energy Estimation of Reconfigurable Architectures
Domain-Specific Modeling for Rapid Sytem-Wide Energy Etimation of Reconfigurable Architecture Seonil Choi 1,Ju-wookJang 2, Sumit Mohanty 1, Viktor K. Praanna 1 1 Dept. of Electrical Engg. 2 Dept. of Electronic
More informationA Practical Model for Minimizing Waiting Time in a Transit Network
A Practical Model for Minimizing Waiting Time in a Tranit Network Leila Dianat, MASc, Department of Civil Engineering, Sharif Univerity of Technology, Tehran, Iran Youef Shafahi, Ph.D. Aociate Profeor,
More informationSemi-Distributed Load Balancing for Massively Parallel Multicomputer Systems
Syracue Univerity SUFAC lectrical ngineering and Computer Science echnical eport College of ngineering and Computer Science 8-1991 Semi-Ditributed Load Balancing for aively Parallel ulticomputer Sytem
More informationSLA Adaptation for Service Overlay Networks
SLA Adaptation for Service Overlay Network Con Tran 1, Zbigniew Dziong 1, and Michal Pióro 2 1 Department of Electrical Engineering, École de Technologie Supérieure, Univerity of Quebec, Montréal, Canada
More informationThe underigned hereby recommend to the Faculty of Graduate Studie and Reearch aceeptance of the thei, Two Topic in Applied Algorithmic ubmitted by Pat
Two Topic in Applied Algorithmic By Patrick R. Morin A thei ubmitted to the Faculty of Graduate Studie and Reearch in partial fullment of the requirement for the degree of Mater of Computer Science Ottawa-Carleton
More informationSIMIT 7. Profinet IO Gateway. User Manual
SIMIT 7 Profinet IO Gateway Uer Manual Edition January 2013 Siemen offer imulation oftware to plan, imulate and optimize plant and machine. The imulation- and optimizationreult are only non-binding uggetion
More informationMulti-Target Tracking In Clutter
Multi-Target Tracking In Clutter John N. Sander-Reed, Mary Jo Duncan, W.B. Boucher, W. Michael Dimmler, Shawn O Keefe ABSTRACT A high frame rate (0 Hz), multi-target, video tracker ha been developed and
More informationLocalized Minimum Spanning Tree Based Multicast Routing with Energy-Efficient Guaranteed Delivery in Ad Hoc and Sensor Networks
Localized Minimum Spanning Tree Baed Multicat Routing with Energy-Efficient Guaranteed Delivery in Ad Hoc and Senor Network Hanne Frey Univerity of Paderborn D-3398 Paderborn hanne.frey@uni-paderborn.de
More informationDiverse: Application-Layer Service Differentiation in Peer-to-Peer Communications
Divere: Application-Layer Service Differentiation in Peer-to-Peer Communication Chuan Wu, Student Member, IEEE, Baochun Li, Senior Member, IEEE Department of Electrical and Computer Engineering Univerity
More informationComputer Arithmetic Homework Solutions. 1 An adder for graphics. 2 Partitioned adder. 3 HDL implementation of a partitioned adder
Computer Arithmetic Homework 3 2016 2017 Solution 1 An adder for graphic In a normal ripple carry addition of two poitive number, the carry i the ignal for a reult exceeding the maximum. We ue thi ignal
More informationEdits in Xylia Validity Preserving Editing of XML Documents
dit in Xylia Validity Preerving diting of XML Document Pouria Shaker, Theodore S. Norvell, and Denni K. Peter Faculty of ngineering and Applied Science, Memorial Univerity of Newfoundland, St. John, NFLD,
More informationA Multi-objective Genetic Algorithm for Reliability Optimization Problem
International Journal of Performability Engineering, Vol. 5, No. 3, April 2009, pp. 227-234. RAMS Conultant Printed in India A Multi-objective Genetic Algorithm for Reliability Optimization Problem AMAR
More informationThe norm Package. November 15, Title Analysis of multivariate normal datasets with missing values
The norm Package November 15, 2003 Verion 1.0-9 Date 2002/05/06 Title Analyi of multivariate normal dataet with miing value Author Ported to R by Alvaro A. Novo . Original by Joeph
More informationProblem Set 2 (Due: Friday, October 19, 2018)
Electrical and Computer Engineering Memorial Univerity of Newfoundland ENGI 9876 - Advanced Data Network Fall 2018 Problem Set 2 (Due: Friday, October 19, 2018) Quetion 1 Conider an M/M/1 model of a queue
More informationSynthesis of Signal Processing Structured Datapaths for FPGAs Supporting RAMs and Busses
Synthei of Signal Proceing Structured Datapath for FPGA Supporting AM and Bue Baher Haroun and Behzad Sajjadi Department of lectrical and Computer ngineering, Concordia Univerity 455 de Maionneuve Blvd.
More informationA TOPSIS based Method for Gene Selection for Cancer Classification
Volume 67 No17, April 2013 A TOPSIS baed Method for Gene Selection for Cancer Claification IMAbd-El Fattah,WIKhedr, KMSallam, 1 Department of Statitic, 3 Department of Deciion upport, 2 Department of information
More informationKS3 Maths Assessment Objectives
KS3 Math Aement Objective Tranition Stage 9 Ratio & Proportion Probabilit y & Statitic Appreciate the infinite nature of the et of integer, real and rational number Can interpret fraction and percentage
More information( ) subject to m. e (2) L are 2L+1. = s SEG SEG Las Vegas 2012 Annual Meeting Page 1
A new imultaneou ource eparation algorithm uing frequency-divere filtering Ying Ji*, Ed Kragh, and Phil Chritie, Schlumberger Cambridge Reearch Summary We decribe a new imultaneou ource eparation algorithm
More informationelse end while End References
621-630. [RM89] [SK76] Roenfeld, A. and Melter, R. A., Digital geometry, The Mathematical Intelligencer, vol. 11, No. 3, 1989, pp. 69-72. Sklanky, J. and Kibler, D. F., A theory of nonuniformly digitized
More informationA Linear Interpolation-Based Algorithm for Path Planning and Replanning on Girds *
Advance in Linear Algebra & Matrix Theory, 2012, 2, 20-24 http://dx.doi.org/10.4236/alamt.2012.22003 Publihed Online June 2012 (http://www.scirp.org/journal/alamt) A Linear Interpolation-Baed Algorithm
More informationA System Dynamics Model for Transient Availability Modeling of Repairable Redundant Systems
International Journal of Performability Engineering Vol., No. 3, May 05, pp. 03-. RAMS Conultant Printed in India A Sytem Dynamic Model for Tranient Availability Modeling of Repairable Redundant Sytem
More informationDevelopment of an atmospheric climate model with self-adapting grid and physics
Intitute of Phyic Publihing Journal of Phyic: Conference Serie 16 (2005) 353 357 doi:10.1088/1742-6596/16/1/049 SciDAC 2005 Development of an atmopheric climate model with elf-adapting grid and phyic Joyce
More information