Array transposition in CUDA shared memory
|
|
- Bryan Summers
- 5 years ago
- Views:
Transcription
1 Array transposton n CUDA shared memory Mke Gles February 19, 2014 Abstract Ths short note s nspred by some code wrtten by Jeremy Appleyard for the transposton of data through shared memory. I had some dffculty gettng my head around t, and decded t would be helpful to have a few fgures to explan t. I ve also extended t slghtly to cover more general cases. 1
2 j Fgure 1: Illustraton of array to be wrtten nto, and read from 1 Objectve As llustrated n the fgure above we want to work wth a shared memory array whch s mathematcally of wdth I, and heght 32 (equal to the warp sze). We want to effectvely transpose some data by wrtng nto t wth the threads n the thread block fllng t row-wse, workng across the frst row, then the second, and so on n ascendng order of + ji, and then (after a synchronsaton) readng data out of t column-wse, workng up the frst column, then the second and so n n ascendng order of j Or, vce versa, we mght want to fll t by columns, and then read t out by rows. The man applcaton for ths s for loadng n (or storng) data whch s stored as a Array-of-Structs, each of sze I. To acheve a coalesced read from devce memory usng a sngle warp, the threads can load n contguous vectors from devce memory and fll the shared memory array by rows. Then they can read t back nto regsters from columns so that each thread gets ts requred struct data. The process would be reversed for wrtng back to devce memory. In both cases, the array ndex n devce memory would correspond to +ji. The second applcaton arses n the ADI solvers whch Jeremy and I are workng on. Here, for part of the calculaton, n order to maxmse coalesced memory transfers t s effcent for threads to work on the array row-wse ntally, but there s then a mddle secton n whch t s necessary to work on columns wth a separate warp for each column, before fnally revertng to the orgnal thread mappng for the fnal stage. The challenge s to come up wth a mappng (, j) k to an ndex k n the lnear shared memory array so that there are no memory bank conflcts when accessng the data n ether drecton. 2
3 j Fgure 2: Shared memory ndces when I = 5. 2 I odd When I s odd we can defne k = + j I. Ths naturally gves no bank conflcts when readng row-wse, snce each warp gets 32 contguous addresses and current NVIDIA GPUs have 32 shared memory banks. Furthermore, there are no bank conflcts n each column, because j = 32 s the smallest strctly postve nteger such that j I mod 32 = 0 whch would lead to a bank conflct wth the element j =0. 3
4 padded by j 3 I a power of Fgure 3: Shared memory ndces when I = 4. When I s a power of 2, then the defnton k = + j I would lead to bank conflcts along each column. The frst bank conflct s when = 0, j = 8, k = 32. Ths suggests the dea of paddng by 1 after every 32 elements, gvng the mappng k = + j I + ( + ji)/32 where the dvson s nterpreted n the nteger sense (.e. dscardng the remander). 4
5 padded by j 4 General I Fgure 4: Shared memory ndces when I = 6. Havng handled the two extreme cases, now we consder the general case n whch I = P F, where P s a power of 2, and F s odd. In ths case k = + j I leads to the frst bank conflct when = 0 and If P 32 then ths mples j I mod 32 = 0. j F mod (32/P ) = 0, whch happens frst when j = 32/P, and hence ji = 32F. Thus, the padded defnton to avod conflcts s k = + j I + ( + j I)/(32F ). (Note: when I = F, then ( + j I)/(32F ) = 0 whch correctly gves us back the unpadded verson.) Alternatvely, f P > 32 then the frst bank conflct occurs when j = 1, and an approprate paddng s k = + j I + j. 5
6 Implementaton notes The smplest thng s to defne the (, j) pars for loadng and storng, and then compute the k for each. At worst, the paddng ncreases the shared memory requrements by approxmately 3%. The computaton of k requres 2 addtonal nteger operatons, a bt-shft and an addton. An alternatve, when I s even, s to use the mappng k = + j (I +1). Ths avods the 2 addtonal nteger operatons, but at the expense of usng more shared memory. 6
Complex Numbers. Now we also saw that if a and b were both positive then ab = a b. For a second let s forget that restriction and do the following.
Complex Numbers The last topc n ths secton s not really related to most of what we ve done n ths chapter, although t s somewhat related to the radcals secton as we wll see. We also won t need the materal
More informationCompiler Design. Spring Register Allocation. Sample Exercises and Solutions. Prof. Pedro C. Diniz
Compler Desgn Sprng 2014 Regster Allocaton Sample Exercses and Solutons Prof. Pedro C. Dnz USC / Informaton Scences Insttute 4676 Admralty Way, Sute 1001 Marna del Rey, Calforna 90292 pedro@s.edu Regster
More informationAssignment # 2. Farrukh Jabeen Algorithms 510 Assignment #2 Due Date: June 15, 2009.
Farrukh Jabeen Algorthms 51 Assgnment #2 Due Date: June 15, 29. Assgnment # 2 Chapter 3 Dscrete Fourer Transforms Implement the FFT for the DFT. Descrbed n sectons 3.1 and 3.2. Delverables: 1. Concse descrpton
More informationA Binarization Algorithm specialized on Document Images and Photos
A Bnarzaton Algorthm specalzed on Document mages and Photos Ergna Kavalleratou Dept. of nformaton and Communcaton Systems Engneerng Unversty of the Aegean kavalleratou@aegean.gr Abstract n ths paper, a
More informationLecture 5: Multilayer Perceptrons
Lecture 5: Multlayer Perceptrons Roger Grosse 1 Introducton So far, we ve only talked about lnear models: lnear regresson and lnear bnary classfers. We noted that there are functons that can t be represented
More informationTPL-Aware Displacement-driven Detailed Placement Refinement with Coloring Constraints
TPL-ware Dsplacement-drven Detaled Placement Refnement wth Colorng Constrants Tao Ln Iowa State Unversty tln@astate.edu Chrs Chu Iowa State Unversty cnchu@astate.edu BSTRCT To mnmze the effect of process
More informationRandom Kernel Perceptron on ATTiny2313 Microcontroller
Random Kernel Perceptron on ATTny233 Mcrocontroller Nemanja Djurc Department of Computer and Informaton Scences, Temple Unversty Phladelpha, PA 922, USA nemanja.djurc@temple.edu Slobodan Vucetc Department
More informationGSLM Operations Research II Fall 13/14
GSLM 58 Operatons Research II Fall /4 6. Separable Programmng Consder a general NLP mn f(x) s.t. g j (x) b j j =. m. Defnton 6.. The NLP s a separable program f ts objectve functon and all constrants are
More informationMathematics 256 a course in differential equations for engineering students
Mathematcs 56 a course n dfferental equatons for engneerng students Chapter 5. More effcent methods of numercal soluton Euler s method s qute neffcent. Because the error s essentally proportonal to the
More informationFor instance, ; the five basic number-sets are increasingly more n A B & B A A = B (1)
Secton 1.2 Subsets and the Boolean operatons on sets If every element of the set A s an element of the set B, we say that A s a subset of B, or that A s contaned n B, or that B contans A, and we wrte A
More informationBrave New World Pseudocode Reference
Brave New World Pseudocode Reference Pseudocode s a way to descrbe how to accomplsh tasks usng basc steps lke those a computer mght perform. In ths week s lab, you'll see how a form of pseudocode can be
More informationLecture - Data Encryption Standard 4
The Data Encrypton Standard For an encrypton algorthm we requre: secrecy of the key and not of the algorthm tself s the only thng that s needed to ensure the prvacy of the data the best cryptographc algorthms
More informationELEC 377 Operating Systems. Week 6 Class 3
ELEC 377 Operatng Systems Week 6 Class 3 Last Class Memory Management Memory Pagng Pagng Structure ELEC 377 Operatng Systems Today Pagng Szes Vrtual Memory Concept Demand Pagng ELEC 377 Operatng Systems
More informationHermite Splines in Lie Groups as Products of Geodesics
Hermte Splnes n Le Groups as Products of Geodescs Ethan Eade Updated May 28, 2017 1 Introducton 1.1 Goal Ths document defnes a curve n the Le group G parametrzed by tme and by structural parameters n the
More information3D vector computer graphics
3D vector computer graphcs Paolo Varagnolo: freelance engneer Padova Aprl 2016 Prvate Practce ----------------------------------- 1. Introducton Vector 3D model representaton n computer graphcs requres
More informationProgramming in Fortran 90 : 2017/2018
Programmng n Fortran 90 : 2017/2018 Programmng n Fortran 90 : 2017/2018 Exercse 1 : Evaluaton of functon dependng on nput Wrte a program who evaluate the functon f (x,y) for any two user specfed values
More informationAMath 483/583 Lecture 21 May 13, Notes: Notes: Jacobi iteration. Notes: Jacobi with OpenMP coarse grain
AMath 483/583 Lecture 21 May 13, 2011 Today: OpenMP and MPI versons of Jacob teraton Gauss-Sedel and SOR teratve methods Next week: More MPI Debuggng and totalvew GPU computng Read: Class notes and references
More informationSequential search. Building Java Programs Chapter 13. Sequential search. Sequential search
Sequental search Buldng Java Programs Chapter 13 Searchng and Sortng sequental search: Locates a target value n an array/lst by examnng each element from start to fnsh. How many elements wll t need to
More informationF Geometric Mean Graphs
Avalable at http://pvamu.edu/aam Appl. Appl. Math. ISSN: 1932-9466 Vol. 10, Issue 2 (December 2015), pp. 937-952 Applcatons and Appled Mathematcs: An Internatonal Journal (AAM) F Geometrc Mean Graphs A.
More informationSENSITIVITY ANALYSIS IN LINEAR PROGRAMMING USING A CALCULATOR
SENSITIVITY ANALYSIS IN LINEAR PROGRAMMING USING A CALCULATOR Judth Aronow Rchard Jarvnen Independent Consultant Dept of Math/Stat 559 Frost Wnona State Unversty Beaumont, TX 7776 Wnona, MN 55987 aronowju@hal.lamar.edu
More informationNews. Recap: While Loop Example. Reading. Recap: Do Loop Example. Recap: For Loop Example
Unversty of Brtsh Columba CPSC, Intro to Computaton Jan-Apr Tamara Munzner News Assgnment correctons to ASCIIArtste.java posted defntely read WebCT bboards Arrays Lecture, Tue Feb based on sldes by Kurt
More informationLoad Balancing for Hex-Cell Interconnection Network
Int. J. Communcatons, Network and System Scences,,, - Publshed Onlne Aprl n ScRes. http://www.scrp.org/journal/jcns http://dx.do.org/./jcns.. Load Balancng for Hex-Cell Interconnecton Network Saher Manaseer,
More informationThe Greedy Method. Outline and Reading. Change Money Problem. Greedy Algorithms. Applications of the Greedy Strategy. The Greedy Method Technique
//00 :0 AM Outlne and Readng The Greedy Method The Greedy Method Technque (secton.) Fractonal Knapsack Problem (secton..) Task Schedulng (secton..) Mnmum Spannng Trees (secton.) Change Money Problem Greedy
More informationR s s f. m y s. SPH3UW Unit 7.3 Spherical Concave Mirrors Page 1 of 12. Notes
SPH3UW Unt 7.3 Sphercal Concave Mrrors Page 1 of 1 Notes Physcs Tool box Concave Mrror If the reflectng surface takes place on the nner surface of the sphercal shape so that the centre of the mrror bulges
More informationON SOME ENTERTAINING APPLICATIONS OF THE CONCEPT OF SET IN COMPUTER SCIENCE COURSE
Yordzhev K., Kostadnova H. Інформаційні технології в освіті ON SOME ENTERTAINING APPLICATIONS OF THE CONCEPT OF SET IN COMPUTER SCIENCE COURSE Yordzhev K., Kostadnova H. Some aspects of programmng educaton
More informationPrivate Information Retrieval (PIR)
2 Levente Buttyán Problem formulaton Alce wants to obtan nformaton from a database, but she does not want the database to learn whch nformaton she wanted e.g., Alce s an nvestor queryng a stock-market
More informationOn Some Entertaining Applications of the Concept of Set in Computer Science Course
On Some Entertanng Applcatons of the Concept of Set n Computer Scence Course Krasmr Yordzhev *, Hrstna Kostadnova ** * Assocate Professor Krasmr Yordzhev, Ph.D., Faculty of Mathematcs and Natural Scences,
More informationWishing you all a Total Quality New Year!
Total Qualty Management and Sx Sgma Post Graduate Program 214-15 Sesson 4 Vnay Kumar Kalakband Assstant Professor Operatons & Systems Area 1 Wshng you all a Total Qualty New Year! Hope you acheve Sx sgma
More informationThe stream cipher MICKEY-128 (version 1) Algorithm specification issue 1.0
The stream cpher MICKEY-128 (verson 1 Algorthm specfcaton ssue 1. Steve Babbage Vodafone Group R&D, Newbury, UK steve.babbage@vodafone.com Matthew Dodd Independent consultant matthew@mdodd.net www.mdodd.net
More informationParallel matrix-vector multiplication
Appendx A Parallel matrx-vector multplcaton The reduced transton matrx of the three-dmensonal cage model for gel electrophoress, descrbed n secton 3.2, becomes excessvely large for polymer lengths more
More informationParallelism for Nested Loops with Non-uniform and Flow Dependences
Parallelsm for Nested Loops wth Non-unform and Flow Dependences Sam-Jn Jeong Dept. of Informaton & Communcaton Engneerng, Cheonan Unversty, 5, Anseo-dong, Cheonan, Chungnam, 330-80, Korea. seong@cheonan.ac.kr
More informationCHARUTAR VIDYA MANDAL S SEMCOM Vallabh Vidyanagar
CHARUTAR VIDYA MANDAL S SEMCOM Vallabh Vdyanagar Faculty Name: Am D. Trved Class: SYBCA Subject: US03CBCA03 (Advanced Data & Fle Structure) *UNIT 1 (ARRAYS AND TREES) **INTRODUCTION TO ARRAYS If we want
More informationThe Codesign Challenge
ECE 4530 Codesgn Challenge Fall 2007 Hardware/Software Codesgn The Codesgn Challenge Objectves In the codesgn challenge, your task s to accelerate a gven software reference mplementaton as fast as possble.
More informationA Fast Visual Tracking Algorithm Based on Circle Pixels Matching
A Fast Vsual Trackng Algorthm Based on Crcle Pxels Matchng Zhqang Hou hou_zhq@sohu.com Chongzhao Han czhan@mal.xjtu.edu.cn Ln Zheng Abstract: A fast vsual trackng algorthm based on crcle pxels matchng
More informationCE 221 Data Structures and Algorithms
CE 1 ata Structures and Algorthms Chapter 4: Trees BST Text: Read Wess, 4.3 Izmr Unversty of Economcs 1 The Search Tree AT Bnary Search Trees An mportant applcaton of bnary trees s n searchng. Let us assume
More informationSum of Linear and Fractional Multiobjective Programming Problem under Fuzzy Rules Constraints
Australan Journal of Basc and Appled Scences, 2(4): 1204-1208, 2008 ISSN 1991-8178 Sum of Lnear and Fractonal Multobjectve Programmng Problem under Fuzzy Rules Constrants 1 2 Sanjay Jan and Kalash Lachhwan
More informationIntroduction. Leslie Lamports Time, Clocks & the Ordering of Events in a Distributed System. Overview. Introduction Concepts: Time
Lesle Laports e, locks & the Orderng of Events n a Dstrbuted Syste Joseph Sprng Departent of oputer Scence Dstrbuted Systes and Securty Overvew Introducton he artal Orderng Logcal locks Orderng the Events
More informationSolving Planted Motif Problem on GPU
Solvng Planted Motf Problem on GPU Naga Shalaja Dasar Old Domnon Unversty Norfolk, VA, USA ndasar@cs.odu.edu Ranjan Desh Old Domnon Unversty Norfolk, VA, USA dranjan@cs.odu.edu Zubar M Old Domnon Unversty
More informationSorting and Algorithm Analysis
Unt 7 Sortng and Algorthm Analyss Computer Scence S-111 Harvard Unversty Davd G. Sullvan, Ph.D. Sortng an Array of Integers 0 1 2 n-2 n-1 arr 15 7 36 40 12 Ground rules: sort the values n ncreasng order
More informationLoop Transformations for Parallelism & Locality. Review. Scalar Expansion. Scalar Expansion: Motivation
Loop Transformatons for Parallelsm & Localty Last week Data dependences and loops Loop transformatons Parallelzaton Loop nterchange Today Scalar expanson for removng false dependences Loop nterchange Loop
More informationLoop Permutation. Loop Transformations for Parallelism & Locality. Legality of Loop Interchange. Loop Interchange (cont)
Loop Transformatons for Parallelsm & Localty Prevously Data dependences and loops Loop transformatons Parallelzaton Loop nterchange Today Loop nterchange Loop transformatons and transformaton frameworks
More informationIntro. Iterators. 1. Access
Intro Ths mornng I d lke to talk a lttle bt about s and s. We wll start out wth smlartes and dfferences, then we wll see how to draw them n envronment dagrams, and we wll fnsh wth some examples. Happy
More informationVirtual Memory. Background. No. 10. Virtual Memory: concept. Logical Memory Space (review) Demand Paging(1) Virtual Memory
Background EECS. Operatng System Fundamentals No. Vrtual Memory Prof. Hu Jang Department of Electrcal Engneerng and Computer Scence, York Unversty Memory-management methods normally requres the entre process
More informationCache Memories. Lecture 14 Cache Memories. Inserting an L1 Cache Between the CPU and Main Memory. General Org of a Cache Memory
Topcs Lecture 4 Cache Memores Generc cache memory organzaton Drect mapped caches Set assocate caches Impact of caches on performance Cache Memores Cache memores are small, fast SRAM-based memores managed
More informationAn Optimal Algorithm for Prufer Codes *
J. Software Engneerng & Applcatons, 2009, 2: 111-115 do:10.4236/jsea.2009.22016 Publshed Onlne July 2009 (www.scrp.org/journal/jsea) An Optmal Algorthm for Prufer Codes * Xaodong Wang 1, 2, Le Wang 3,
More informationAn Iterative Solution Approach to Process Plant Layout using Mixed Integer Optimisation
17 th European Symposum on Computer Aded Process Engneerng ESCAPE17 V. Plesu and P.S. Agach (Edtors) 2007 Elsever B.V. All rghts reserved. 1 An Iteratve Soluton Approach to Process Plant Layout usng Mxed
More informationy and the total sum of
Lnear regresson Testng for non-lnearty In analytcal chemstry, lnear regresson s commonly used n the constructon of calbraton functons requred for analytcal technques such as gas chromatography, atomc absorpton
More informationCHAPTER 2 DECOMPOSITION OF GRAPHS
CHAPTER DECOMPOSITION OF GRAPHS. INTRODUCTION A graph H s called a Supersubdvson of a graph G f H s obtaned from G by replacng every edge uv of G by a bpartte graph,m (m may vary for each edge by dentfyng
More informationNAG Fortran Library Chapter Introduction. G10 Smoothing in Statistics
Introducton G10 NAG Fortran Lbrary Chapter Introducton G10 Smoothng n Statstcs Contents 1 Scope of the Chapter... 2 2 Background to the Problems... 2 2.1 Smoothng Methods... 2 2.2 Smoothng Splnes and Regresson
More information6.854 Advanced Algorithms Petar Maymounkov Problem Set 11 (November 23, 2005) With: Benjamin Rossman, Oren Weimann, and Pouya Kheradpour
6.854 Advanced Algorthms Petar Maymounkov Problem Set 11 (November 23, 2005) Wth: Benjamn Rossman, Oren Wemann, and Pouya Kheradpour Problem 1. We reduce vertex cover to MAX-SAT wth weghts, such that the
More informationA Fast Content-Based Multimedia Retrieval Technique Using Compressed Data
A Fast Content-Based Multmeda Retreval Technque Usng Compressed Data Borko Furht and Pornvt Saksobhavvat NSF Multmeda Laboratory Florda Atlantc Unversty, Boca Raton, Florda 3343 ABSTRACT In ths paper,
More informationCS221: Algorithms and Data Structures. Priority Queues and Heaps. Alan J. Hu (Borrowing slides from Steve Wolfman)
CS: Algorthms and Data Structures Prorty Queues and Heaps Alan J. Hu (Borrowng sldes from Steve Wolfman) Learnng Goals After ths unt, you should be able to: Provde examples of approprate applcatons for
More informationModule Management Tool in Software Development Organizations
Journal of Computer Scence (5): 8-, 7 ISSN 59-66 7 Scence Publcatons Management Tool n Software Development Organzatons Ahmad A. Al-Rababah and Mohammad A. Al-Rababah Faculty of IT, Al-Ahlyyah Amman Unversty,
More informationEmpirical Distributions of Parameter Estimates. in Binary Logistic Regression Using Bootstrap
Int. Journal of Math. Analyss, Vol. 8, 4, no. 5, 7-7 HIKARI Ltd, www.m-hkar.com http://dx.do.org/.988/jma.4.494 Emprcal Dstrbutons of Parameter Estmates n Bnary Logstc Regresson Usng Bootstrap Anwar Ftranto*
More informationA fault tree analysis strategy using binary decision diagrams
Loughborough Unversty Insttutonal Repostory A fault tree analyss strategy usng bnary decson dagrams Ths tem was submtted to Loughborough Unversty's Insttutonal Repostory by the/an author. Addtonal Informaton:
More informationTN348: Openlab Module - Colocalization
TN348: Openlab Module - Colocalzaton Topc The Colocalzaton module provdes the faclty to vsualze and quantfy colocalzaton between pars of mages. The Colocalzaton wndow contans a prevew of the two mages
More informationA New Approach For the Ranking of Fuzzy Sets With Different Heights
New pproach For the ankng of Fuzzy Sets Wth Dfferent Heghts Pushpnder Sngh School of Mathematcs Computer pplcatons Thapar Unversty, Patala-7 00 Inda pushpndersnl@gmalcom STCT ankng of fuzzy sets plays
More informationS1 Note. Basis functions.
S1 Note. Bass functons. Contents Types of bass functons...1 The Fourer bass...2 B-splne bass...3 Power and type I error rates wth dfferent numbers of bass functons...4 Table S1. Smulaton results of type
More informationSupport Vector Machines
Support Vector Machnes Decson surface s a hyperplane (lne n 2D) n feature space (smlar to the Perceptron) Arguably, the most mportant recent dscovery n machne learnng In a nutshell: map the data to a predetermned
More informationHigh-Boost Mesh Filtering for 3-D Shape Enhancement
Hgh-Boost Mesh Flterng for 3-D Shape Enhancement Hrokazu Yagou Λ Alexander Belyaev y Damng We z Λ y z ; ; Shape Modelng Laboratory, Unversty of Azu, Azu-Wakamatsu 965-8580 Japan y Computer Graphcs Group,
More informationMachine Learning: Algorithms and Applications
14/05/1 Machne Learnng: Algorthms and Applcatons Florano Zn Free Unversty of Bozen-Bolzano Faculty of Computer Scence Academc Year 011-01 Lecture 10: 14 May 01 Unsupervsed Learnng cont Sldes courtesy of
More informationCompiling Process Networks to Interaction Nets
Complng Process Networks to Interacton Nets Ian Macke LIX, CNRS UMR 7161, École Polytechnque, 91128 Palaseau Cede, France Kahn process networks are a model of computaton based on a collecton of sequental,
More informationEdge Detection in Noisy Images Using the Support Vector Machines
Edge Detecton n Nosy Images Usng the Support Vector Machnes Hlaro Gómez-Moreno, Saturnno Maldonado-Bascón, Francsco López-Ferreras Sgnal Theory and Communcatons Department. Unversty of Alcalá Crta. Madrd-Barcelona
More informationCable optimization of a long span cable stayed bridge in La Coruña (Spain)
Computer Aded Optmum Desgn n Engneerng XI 107 Cable optmzaton of a long span cable stayed brdge n La Coruña (Span) A. Baldomr & S. Hernández School of Cvl Engneerng, Unversty of Coruña, La Coruña, Span
More informationA High-Quality, Energy Optimized, Real-Time Sampling Rate Conversion Library for the StrongARM Microprocessor
A Hgh-Qualty, Energy Optmzed, Real-Tme Samplng Rate Converson Lbrary for the StrongARM Mcroprocessor Chung-Tse Mar 1, Mat C. Hans, Mark T. Smth, Tajana Smunc, Ronald W. Schafer 1 Moble& Meda Systems Laboratory
More informationReducing Frame Rate for Object Tracking
Reducng Frame Rate for Object Trackng Pavel Korshunov 1 and We Tsang Oo 2 1 Natonal Unversty of Sngapore, Sngapore 11977, pavelkor@comp.nus.edu.sg 2 Natonal Unversty of Sngapore, Sngapore 11977, oowt@comp.nus.edu.sg
More informationAccelerating X-Ray data collection using Pyramid Beam ray casting geometries
Acceleratng X-Ray data collecton usng Pyramd Beam ray castng geometres Amr Averbuch Guy Lfchtz Y. Shkolnsky 3 School of Computer Scence Department of Appled Mathematcs, School of Mathematcal Scences Tel
More informationChapter 6 Programmng the fnte element method Inow turn to the man subject of ths book: The mplementaton of the fnte element algorthm n computer programs. In order to make my dscusson as straghtforward
More informationAn Application of the Dulmage-Mendelsohn Decomposition to Sparse Null Space Bases of Full Row Rank Matrices
Internatonal Mathematcal Forum, Vol 7, 2012, no 52, 2549-2554 An Applcaton of the Dulmage-Mendelsohn Decomposton to Sparse Null Space Bases of Full Row Rank Matrces Mostafa Khorramzadeh Department of Mathematcal
More informationGarbling Gadgets for Boolean and Arithmetic Circuits
Garblng Gadgets for Boolean and Arthmetc Crcuts Marshall Ball Columba Unnversty New York, NY marshall@cs.columba.edu Tal Malkn Columba Unversty New York, NY tal@cs.columba.edu Mke Rosulek Oregon State
More informationSorting. Sorting. Why Sort? Consistent Ordering
Sortng CSE 6 Data Structures Unt 15 Readng: Sectons.1-. Bubble and Insert sort,.5 Heap sort, Secton..6 Radx sort, Secton.6 Mergesort, Secton. Qucksort, Secton.8 Lower bound Sortng Input an array A of data
More informationSpecifications in 2001
Specfcatons n 200 MISTY (updated : May 3, 2002) September 27, 200 Mtsubsh Electrc Corporaton Block Cpher Algorthm MISTY Ths document shows a complete descrpton of encrypton algorthm MISTY, whch are secret-key
More informationCMPS 10 Introduction to Computer Science Lecture Notes
CPS 0 Introducton to Computer Scence Lecture Notes Chapter : Algorthm Desgn How should we present algorthms? Natural languages lke Englsh, Spansh, or French whch are rch n nterpretaton and meanng are not
More informationAn Entropy-Based Approach to Integrated Information Needs Assessment
Dstrbuton Statement A: Approved for publc release; dstrbuton s unlmted. An Entropy-Based Approach to ntegrated nformaton Needs Assessment June 8, 2004 Wllam J. Farrell Lockheed Martn Advanced Technology
More informationProblem Set 3 Solutions
Introducton to Algorthms October 4, 2002 Massachusetts Insttute of Technology 6046J/18410J Professors Erk Demane and Shaf Goldwasser Handout 14 Problem Set 3 Solutons (Exercses were not to be turned n,
More informationSpace-Optimal, Wait-Free Real-Time Synchronization
1 Space-Optmal, Wat-Free Real-Tme Synchronzaton Hyeonjoong Cho, Bnoy Ravndran ECE Dept., Vrgna Tech Blacksburg, VA 24061, USA {hjcho,bnoy}@vt.edu E. Douglas Jensen The MITRE Corporaton Bedford, MA 01730,
More informationHigh level vs Low Level. What is a Computer Program? What does gcc do for you? Program = Instructions + Data. Basic Computer Organization
What s a Computer Program? Descrpton of algorthms and data structures to acheve a specfc ojectve Could e done n any language, even a natural language lke Englsh Programmng language: A Standard notaton
More informationSolving Route Planning Using Euler Path Transform
Solvng Route Plannng Usng Euler Path ransform Y-Chong Zeng Insttute of Informaton Scence Academa Snca awan ychongzeng@s.snca.edu.tw Abstract hs paper presents a method to solve route plannng problem n
More informationConfiguration Management in Multi-Context Reconfigurable Systems for Simultaneous Performance and Power Optimizations*
Confguraton Management n Mult-Context Reconfgurable Systems for Smultaneous Performance and Power Optmzatons* Rafael Maestre, Mlagros Fernandez Departamento de Arqutectura de Computadores y Automátca Unversdad
More informationETAtouch RESTful Webservices
ETAtouch RESTful Webservces Verson 1.1 November 8, 2012 Contents 1 Introducton 3 2 The resource /user/ap 6 2.1 HTTP GET................................... 6 2.2 HTTP POST..................................
More informationParallel Solutions of Indexed Recurrence Equations
Parallel Solutons of Indexed Recurrence Equatons Yos Ben-Asher Dep of Math and CS Hafa Unversty 905 Hafa, Israel yos@mathcshafaacl Gad Haber IBM Scence and Technology 905 Hafa, Israel haber@hafascvnetbmcom
More informationAPPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT
3. - 5. 5., Brno, Czech Republc, EU APPLICATION OF MULTIVARIATE LOSS FUNCTION FOR ASSESSMENT OF THE QUALITY OF TECHNOLOGICAL PROCESS MANAGEMENT Abstract Josef TOŠENOVSKÝ ) Lenka MONSPORTOVÁ ) Flp TOŠENOVSKÝ
More informationSupport Vector Machines
/9/207 MIST.6060 Busness Intellgence and Data Mnng What are Support Vector Machnes? Support Vector Machnes Support Vector Machnes (SVMs) are supervsed learnng technques that analyze data and recognze patterns.
More informationCache Performance 3/28/17. Agenda. Cache Abstraction and Metrics. Direct-Mapped Cache: Placement and Access
Agenda Cache Performance Samra Khan March 28, 217 Revew from last lecture Cache access Assocatvty Replacement Cache Performance Cache Abstracton and Metrcs Address Tag Store (s the address n the cache?
More informationUB at GeoCLEF Department of Geography Abstract
UB at GeoCLEF 2006 Mguel E. Ruz (1), Stuart Shapro (2), June Abbas (1), Slva B. Southwck (1) and Davd Mark (3) State Unversty of New York at Buffalo (1) Department of Lbrary and Informaton Studes (2) Department
More informationThe Erdős Pósa property for vertex- and edge-disjoint odd cycles in graphs on orientable surfaces
Dscrete Mathematcs 307 (2007) 764 768 www.elsever.com/locate/dsc Note The Erdős Pósa property for vertex- and edge-dsjont odd cycles n graphs on orentable surfaces Ken-Ich Kawarabayash a, Atsuhro Nakamoto
More informationRelated-Mode Attacks on CTR Encryption Mode
Internatonal Journal of Network Securty, Vol.4, No.3, PP.282 287, May 2007 282 Related-Mode Attacks on CTR Encrypton Mode Dayn Wang, Dongda Ln, and Wenlng Wu (Correspondng author: Dayn Wang) Key Laboratory
More informationFloating-Point Division Algorithms for an x86 Microprocessor with a Rectangular Multiplier
Floatng-Pont Dvson Algorthms for an x86 Mcroprocessor wth a Rectangular Multpler Mchael J. Schulte Dmtr Tan Carl E. Lemonds Unversty of Wsconsn Advanced Mcro Devces Advanced Mcro Devces Schulte@engr.wsc.edu
More informationData Representation in Digital Design, a Single Conversion Equation and a Formal Languages Approach
Data Representaton n Dgtal Desgn, a Sngle Converson Equaton and a Formal Languages Approach Hassan Farhat Unversty of Nebraska at Omaha Abstract- In the study of data representaton n dgtal desgn and computer
More informationEsc101 Lecture 1 st April, 2008 Generating Permutation
Esc101 Lecture 1 Aprl, 2008 Generatng Permutaton In ths class we wll look at a problem to wrte a program that takes as nput 1,2,...,N and prnts out all possble permutatons of the numbers 1,2,...,N. For
More informationsuch that is accepted of states in , where Finite Automata Lecture 2-1: Regular Languages be an FA. A string is the transition function,
* Lecture - Regular Languages S Lecture - Fnte Automata where A fnte automaton s a -tuple s a fnte set called the states s a fnte set called the alphabet s the transton functon s the ntal state s the set
More informationCHAPTER 10: ALGORITHM DESIGN TECHNIQUES
CHAPTER 10: ALGORITHM DESIGN TECHNIQUES So far, we have been concerned wth the effcent mplementaton of algorthms. We have seen that when an algorthm s gven, the actual data structures need not be specfed.
More informationSolving two-person zero-sum game by Matlab
Appled Mechancs and Materals Onlne: 2011-02-02 ISSN: 1662-7482, Vols. 50-51, pp 262-265 do:10.4028/www.scentfc.net/amm.50-51.262 2011 Trans Tech Publcatons, Swtzerland Solvng two-person zero-sum game by
More informationOutline. Midterm Review. Declaring Variables. Main Variable Data Types. Symbolic Constants. Arithmetic Operators. Midterm Review March 24, 2014
Mdterm Revew March 4, 4 Mdterm Revew Larry Caretto Mechancal Engneerng 9 Numercal Analyss of Engneerng Systems March 4, 4 Outlne VBA and MATLAB codng Varable types Control structures (Loopng and Choce)
More informationStorage Binding in RTL synthesis
Storage Bndng n RTL synthess Pe Zhang Danel D. Gajsk Techncal Report ICS-0-37 August 0th, 200 Center for Embedded Computer Systems Department of Informaton and Computer Scence Unersty of Calforna, Irne
More informationVectorization of Image Outlines Using Rational Spline and Genetic Algorithm
01 Internatonal Conference on Image, Vson and Computng (ICIVC 01) IPCSIT vol. 50 (01) (01) IACSIT Press, Sngapore DOI: 10.776/IPCSIT.01.V50.4 Vectorzaton of Image Outlnes Usng Ratonal Splne and Genetc
More information124 Chapter 8. Case Study: A Memory Component ndcatng some error condton. An exceptonal return of a value e s called rasng excepton e. A return s ssue
Chapter 8 Case Study: A Memory Component In chapter 6 we gave the outlne of a case study on the renement of a safe regster. In ths chapter wepresent the outne of another case study on persstent communcaton;
More informationA SYSTOLIC APPROACH TO LOOP PARTITIONING AND MAPPING INTO FIXED SIZE DISTRIBUTED MEMORY ARCHITECTURES
A SYSOLIC APPROACH O LOOP PARIIONING AND MAPPING INO FIXED SIZE DISRIBUED MEMORY ARCHIECURES Ioanns Drosts, Nektaros Kozrs, George Papakonstantnou and Panayots sanakas Natonal echncal Unversty of Athens
More informationTsinghua University at TAC 2009: Summarizing Multi-documents by Information Distance
Tsnghua Unversty at TAC 2009: Summarzng Mult-documents by Informaton Dstance Chong Long, Mnle Huang, Xaoyan Zhu State Key Laboratory of Intellgent Technology and Systems, Tsnghua Natonal Laboratory for
More informationKinematics of pantograph masts
Abstract Spacecraft Mechansms Group, ISRO Satellte Centre, Arport Road, Bangalore 560 07, Emal:bpn@sac.ernet.n Flght Dynamcs Dvson, ISRO Satellte Centre, Arport Road, Bangalore 560 07 Emal:pandyan@sac.ernet.n
More information