Benchmarking SpMV on Many-Core Architecture
- Maximilian Byrd
- 5 years ago
Transcription
1 Benchmarking SpMV on Many-Core Architecture Biwei Xie and Zhen Jia, Institute of Computing Technology, Chinese Academy of Sciences; Princeton University
2 Why Benchmarking? To Measure Is To Know -- William Thomson (Lord Kelvin)
3 We have an implementation
4 But, what if
5 Why SpMV?
6 SpMV: Sparse matrix-vector multiplication Given a sparse m × n matrix A and a dense n × 1 vector x, compute y = A × x.
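The definition above can be made concrete with a minimal sketch, assuming the matrix is stored in CSR (one of the formats named on later slides). The function name `spmv_csr` and the example matrix are illustrative and do not come from the slides.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A x for a sparse matrix A stored in CSR form.

    row_ptr[i] .. row_ptr[i + 1] delimits the non-zeros of row i;
    col_idx and values hold their column indices and numeric values.
    """
    num_rows = len(row_ptr) - 1
    y = [0.0] * num_rows
    for i in range(num_rows):
        acc = 0.0
        for k in range(row_ptr[i], row_ptr[i + 1]):
            acc += values[k] * x[col_idx[k]]
        y[i] = acc
    return y


# Illustrative 3 x 4 matrix (not from the slides):
# [[1, 0, 2, 0],
#  [0, 0, 0, 0],
#  [0, 3, 0, 4]]
row_ptr = [0, 2, 2, 4]
col_idx = [0, 2, 1, 3]
values = [1.0, 2.0, 3.0, 4.0]
x = [1.0, 1.0, 1.0, 1.0]
y = spmv_csr(row_ptr, col_idx, values, x)
```

Only the stored non-zeros are touched, so the work is proportional to the number of non-zeros rather than to m × n. The indirect read `x[col_idx[k]]` is the irregular memory access that makes SpMV performance depend on the sparse pattern.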
7 SpMV Application Scenarios HPC: various linear algebra algorithms, such as CG (Conjugate Gradients). Graph computing: PageRank, BFS, etc. CNN: convolution.
8 Factors important to SpMV performance Characteristics of the matrix: data scale, sparsity, sparse pattern. SpMV methods: CSR, CSC, DIA, COO, CSR5, CVR, ELL, ESB, Merge (dozens); parallelization, vectorization, blocking. Platform: X86, ARM, GPU. Different architecture designs impact the final performance a lot. How to locate the best SpMV method for a given sparse matrix on a specific architecture?
9 Benchmarking Methodology
10 Sparse Matrix (Data Set) 1,500+ sparse matrices from UFL (small matrices discarded). Various sparse patterns and data scales. Pictured matrices: Wikipedia-Talk, Web-Google, ASIC100k, Wind Tunnel, Economics, Circuit5M, Cantilever, Ga41As41H72, higgs-twitter, Amazon-0312.
11 SpMV Methods SpMV algorithms: 27 SpMV implementations, drawn from high-quality research (ICS, SC, CGO, ...), from commercial/open-source packages, and from widely used formats (CSR, COO, ...). CSR: by rows. CSC: by columns. BSR: by blocks. DIA: by diagonals. CVR: by multi-rows. ELL: by blocks & rows.
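To illustrate what "by rows" and "by columns" mean for storage, here is a small sketch that stores the same matrix in COO, CSR, and CSC. The helper names (`to_coo`, `to_csr`, `to_csc`) and the example matrix are assumptions made for illustration. The sketch shows only the textbook layouts, not the 27 implementations benchmarked in the talk.

```python
def to_coo(dense):
    """COO: one (row, column, value) triple per non-zero."""
    rows, cols, vals = [], [], []
    for i, row in enumerate(dense):
        for j, v in enumerate(row):
            if v != 0:
                rows.append(i)
                cols.append(j)
                vals.append(v)
    return rows, cols, vals


def to_csr(dense):
    """CSR: non-zeros grouped by row; row_ptr marks where each row starts."""
    row_ptr, col_idx, vals = [0], [], []
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                col_idx.append(j)
                vals.append(v)
        row_ptr.append(len(vals))
    return row_ptr, col_idx, vals


def to_csc(dense):
    """CSC: non-zeros grouped by column; col_ptr marks where each column starts."""
    num_cols = len(dense[0]) if dense else 0
    col_ptr, row_idx, vals = [0], [], []
    for j in range(num_cols):
        for i, row in enumerate(dense):
            if row[j] != 0:
                row_idx.append(i)
                vals.append(row[j])
        col_ptr.append(len(vals))
    return col_ptr, row_idx, vals


# Illustrative matrix (not from the slides).
dense = [
    [1, 0, 2],
    [0, 3, 0],
    [4, 0, 5],
]
coo = to_coo(dense)
csr = to_csr(dense)
csc = to_csc(dense)
```

CSR and CSC hold the same non-zeros in different orders, which changes the order of memory accesses during the multiplication. That ordering is one reason the formats can perform differently on the same matrix.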
12 Platforms (Architectures) Architecture (many-core): representative many-core architectures are Intel Xeon Phi and GPGPU, with very different architecture designs. Intel Xeon Phi: Knights Landing, CMP + SIMD. GPGPU: NVIDIA Tesla M40, SIMT.
13 Performance on CPU
14 Performance on CPU (Continued)
15 Observations on CPU CSR, IE, CSR5 and CVR show good performance on both small and large sparse matrices with various sparse patterns. COO, CSC, and DIA, which are widely used in real-world scenarios, show much poorer performance. BSR, ESB-d and ESB-s are sensitive to the sparse patterns.
16 Performance on GPU
17 Observations on GPU BSR, HYB, ELL and DIA are sensitive to the sparse patterns. The Merge method is stable and insensitive to sparse patterns.
18 Best Methods Distribution No single SpMV method is suitable for all sparse matrices. Some methods show much better performance than others. (Chart panels: CPU, GPU)
19 Optimal Methods Distribution On Phi: CVR and CSR5 are optimal for more than 84% of the data sets. On GPU: the optimal method is quite scattered; CSR5 accounts for 56%.
20 Sub-optimal Sub-optimal methods: slightly worse than the best performance.
21 Xeon Phi: CPU CVR is the best on more than 82% of the matrices, with less than 20% performance loss relative to the optimal. Widely used SpMV methods, like CSR, CSC, COO and DIA, are not as good as expected. (Bar chart, 0% to 100%: share of matrices on which each method is Optimal, Sub-optimal (+10%) or Sub-optimal (+20%); methods shown: CVR, CSR5, IE, CSR, VHCC, ESB-s, DIA, BSR, ESB-d, CSC, COO)
22 Tesla: GPGPU CSR5 achieves sub-optimal performance on 65% of the sparse matrices, with less than 20% performance loss. ELL and its derivatives show modest performance. (Bar chart, 0% to 100%: share of matrices on which each method is Optimal, Sub-optimal (+10%) or Sub-optimal (+20%))
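The chart legends distinguish Optimal, Sub-optimal (+10%) and Sub-optimal (+20%). The slides do not say exactly how performance loss is computed, so the sketch below assumes it means "runtime at most 10% (or 20%) longer than the fastest method on that matrix". The function name, the method names and the timings are hypothetical and are not measured results.

```python
def classify_methods(times, tiers=(0.10, 0.20)):
    """Label each method by how close its runtime is to the fastest one.

    times maps a method name to its runtime on one matrix (any time unit).
    Assumption: "+10%" means a runtime within 10% of the best runtime.
    """
    best = min(times.values())
    labels = {}
    for method, t in times.items():
        if t == best:
            labels[method] = "optimal"
            continue
        label = "other"
        for tol in tiers:
            if t <= best * (1.0 + tol):
                label = f"sub-optimal(+{int(round(tol * 100))}%)"
                break
        labels[method] = label
    return labels


# Hypothetical timings in milliseconds for one matrix (illustrative only).
example_times = {
    "method_a": 1.00,
    "method_b": 1.05,
    "method_c": 1.18,
    "method_d": 2.50,
}
example_labels = classify_methods(example_times)
```

Applying such a classification to every matrix in the data set and counting the labels per method would produce the kind of stacked percentages shown in the charts.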
23 Correlation Analysis Correlation: analyze the correlation between SpMV performance and the matrix features (data scale, sparse pattern, etc.). The Pearson correlation coefficient ranges from -1 to 1: 1: positive linear correlation, 0: no linear correlation, -1: total negative linear correlation.
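For reference, the Pearson coefficient can be computed as in the minimal sketch below, which uses the standard textbook formula. The function name and sample values are illustrative. How the authors paired features with performance numbers is not stated in the slides.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Illustrative inputs covering the three reference values on the slide.
r_pos = pearson([1, 2, 3, 4], [2, 4, 6, 8])   # perfectly increasing line
r_neg = pearson([1, 2, 3, 4], [8, 6, 4, 2])   # perfectly decreasing line
r_zero = pearson([-1, 0, 1], [1, 0, 1])       # symmetric, no linear trend
```

A coefficient near 0 only rules out a linear relationship; the third example has a clear (quadratic-like) dependence yet no linear correlation.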
24 Correlation Analysis Data scale: as the number of non-zero elements increases, the performance of most SpMV methods increases.
25 Correlation Analysis Density: most SpMV methods show lower throughput when the matrix is sparser.
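The two quantities related on this slide can be sketched as follows. Density is taken as the fraction of non-zero entries. The throughput metric is an assumption, since the slides do not define it: a common convention counts one multiply and one add per non-zero and reports GFLOP/s. Function names and numbers are illustrative.

```python
def density(nnz, num_rows, num_cols):
    """Fraction of entries that are non-zero (1.0 means fully dense)."""
    return nnz / (num_rows * num_cols)


def spmv_gflops(nnz, seconds):
    """Assumed throughput metric: 2 floating-point operations per non-zero."""
    return 2.0 * nnz / seconds / 1e9


# Illustrative values only, not measurements from the talk.
d = density(nnz=4, num_rows=3, num_cols=4)
g = spmv_gflops(nnz=1_000_000, seconds=0.001)
```

Pairing `density` and `spmv_gflops` values across many matrices gives the two series one would feed into a correlation analysis like the one described on the previous slides.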
26 Conclusion Three factors: Sparse matrix: sparse pattern, data scale. SpMV method: parallelization, vectorization, blocking. Hardware platform: Xeon Phi, GPGPU. Takeaways: Certain methods can achieve good performance for most data sets. Some widely used methods, e.g., CSR and CSC, are not as good as emerging ones. For most SpMV methods, a sparser matrix results in lower throughput. Open-source project: a benchmarking framework that supports almost all SpMV methods on Intel Xeon Phi and GPGPU.
27 More results in the paper. Coming soon.
28 Thanks! Q&A
29 Slides for Defending
More information