A Parallel Algorithm for Compressive Sensing
|
|
- Spencer Kelly
- 6 years ago
- Views:
Transcription
1 A Parallel Algorithm for ompressive Sensing Alexandre Borghi Laboratoire de Recherche en Informatique Universite Paris-Sud 11 July A Parallel Algorithm for ompressive Sensing 1
2 Idea of ompressive Sensing ompressive Sensing idea [andès et al. 2004]: under some sparsity assumptions one can exactly reconstruct a signal with few measurements Sparse signal = most of coefficients are 0 (in some basis). Measurements need to be taken in an appropriate space. Example: sparse gradient/measurements in Fourier domain Frequencies kept: 22 lines passing through the zero frequency A Parallel Algorithm for ompressive Sensing 2
3 ompressive Sensing [andes, Romberg, Tao,... ] Wish to minimize { min u J(u) s.t. Au = f with J being l 1 (or the Total Variation) where A is a compressive sensing matrix, and f the observed data In this part of the talk, we consider a penalization approach (LASSO) E µ (u) = J(u) + µ 2 Au f 2 2 Target: minimize E µ ( ) many available algorithms! mathematical design data structures/algorithm implementation efficiency depends on the available computing technologies parallel computing capacities that are specific to each processor not all algorithms can take benefit from available features A Parallel Algorithm for ompressive Sensing 3
4 l 1 -ompressive Sensing Sparsity minimizing l 1 -norm. (S) { min u u 1 s.t. Au = f where u IR n is the signal to reconstruct f IR m is the observed data A IR m n is a ompressive Sensing matrix m n A Parallel Algorithm for ompressive Sensing 4
5 Outline 1 What kind of parallel computing technologies are widely available? 2 Design of an approached l 1 -compressive sensing algorithm (sparse signal) A Parallel Algorithm for ompressive Sensing 5
6 Parallel Many-ore Architectures: PU oeur 1 oeur 2 oeur 3 oeur 4 oeur 1 oeur 2 oeur 3 oeur 4 N1 N1 N1 N1 N1+N2 N1+N2 N1+N2 N1+N2 N2 partagé N2 partagé N3 partagé FSB QPI Figure: Intel PU (left: ore 2 Quad; right: ore i7). A Parallel Algorithm for ompressive Sensing 6
7 Parallel Many-ore Architectures: ell SPE SPE SPE SPE Locale Locale Locale Locale N1 PPE EIB N2 Locale Locale Locale Locale SPE SPE SPE SPE Figure: ell. A Parallel Algorithm for ompressive Sensing 7
8 Parallel Many-ore Architectures: GPU L o c a l e N1 partagé PI Express L o c a l e Mémoire L o c a l e N1 partagé L o c a l e Mémoire Mémoire Mémoire L o c a l e N1 partagé PI Express L o c a l e L o c a l e... L o c a l e N1 partagé L o c a l e L o c a l e... Mémoire Mémoire Mémoire Mémoire Mémoire Mémoire Figure: NVIDIA GPU (left : G80; right : G200). A Parallel Algorithm for ompressive Sensing 8
9 Parallel Many-ore Architectures Restriction to Parallel Many-ore Architectures (widely available) Video card: NVidia GPU (> 96 cores) P: Intel processors multi-core (> 4 ores) Playstation 3 video game console ell (6-8 cores) cheap! Specific characteristics: GFLOPS, energy consumption, price... ommon features: Parallel: composed of several computing units Shared memory: central memory accessible from computing units System of cache hierarchy to improve data transfer A Parallel Algorithm for ompressive Sensing 9
10 Features and Requirements oarse parallelism: Several computational units Multi-core, GPU, ell synchronization issue Vectorization (SIMD): Process the data as a vector alignment issue: data must be accessed at specific addresses By far, the most important feature (because several SIMD units/core) Memory bandwidth: Amount of data that can be transfered data are transferred back and forth from the memory to the processor Starvation issue: a computing unit is waiting for the data By far, the most important issue A compiler (or Matlab) tries to do that automatically Performance highly depends on a successful code optimization Design algorithms implementations meet these requirements A Parallel Algorithm for ompressive Sensing 10
11 l 1 -compressive sensing 1 What kind of parallel computing technologies are widely available? 2 Design of an approached l 1 -compressive sensing algorithm (sparse signal) Moreau-Yosida Regularization/Proximal Points Proximal operator choices Algorithm Experiments and implementation A Parallel Algorithm for ompressive Sensing 11
12 Moreau-Yosida Regularization Recall, we wish to minimize E µ (u) = J(u) + µ 2 Au f 2 2 Moreau-Yosida Regularization [Moreau 65, Yosida 66]: Given a metric M and any current point u (k) { F µ (u (k) ) = inf E µ (u) + 1 } u IR n 2 u u(k) 2 M. This defines the proximal point p µ (u (k) ) and is characterized by: ( 0 E µ + 1 ) 2 u(k) 2 M (p µ (u (k) )) That is p µ (u (k)) = (M + E µ ) 1 ( Mu (k)). A Parallel Algorithm for ompressive Sensing 12
13 Proximal Operator hoices Standard convergence result [Lemarechal 97, Rockafellar 76,... ]: Iterating the proximal point generates a sequence that converges toward a minimizer of E µ How to choose M for parallel computations? Rewriting (M + E µ ) 1, we have J(u (k+1) ) + (µa t A + M)u (k+1) = µa t f + Mu (k). We wish µa t A + M diagonal problem of the form l 1 + g 2 2 So we take M = (1 + ɛ)µid µa t A A Parallel Algorithm for ompressive Sensing 13
14 Proximal Operator hoices With such a metric Solution is given by a shrinkage u (k+1) = 1 (1 + ɛ)µ M = (1 + ɛ)µid µa t A, { µa t f + Mu (k) sg ( µa t f + Mu (k)) if µa t f + Mu (k) > 1 0 otherwise, Shrinkage has been very well studied [hambolle, Daubechies, Elad, Figuereido, Yin,... ] and used in many efficient l 1 S algorithms Shrinkage is easily and efficiently implemented on many-cores Need to perform Au and A t Au in parallel A is a fast transform good (not great) parallel and vectorized version A is an explicit matrix great vectorized version but badly parallel (bandwith issue) A Parallel Algorithm for ompressive Sensing 14
15 Algorithm Recall, we wish to minimize (with µ large) E µ (u) = J(u) + µ 2 Au f 2 2 How to choose initial µ too small shrinkage yields 0 too big slow convergence choose of a good one by a bitonic search Algorithm: 1 Start with u = 0 2 ompute the initial value of µ using the above bitonic approach 3 For k=0 to l max Iterate the proximal point until some convergence criteria are met Au k f 2 2 f 2 2 < tolerance A Parallel Algorithm for ompressive Sensing 15
16 Exact solution from approximate solution 1 ompute the basis matrix B associated with u (k+1) by choosing the columns of A corresponding to the nonzero elements of u (k+1) 2 ompute v = B 1 f 3 Return u (k+2) in base A from v in base B by adding zeros where necessary A Parallel Algorithm for ompressive Sensing 16
17 Experiments and Implementation Implementation: shrinkage is parallel, vectorizable Au is either implemented as an explicit matrix multiplication or a Fast Transform (e.g., DT) explicit multiplication huge amount of data to transfer Experimental conditions Multi-core: Intel ore 2 Quad Q6600, 76 GFLOPS (theoretical) Multi-core: Intel ore i7 920, 86 GFLOPS (theoretical) ell: Sony Playstation 3, 159 GFLOPS (theoretical) GPU: NVidia 8800GTS, 345 GFLOPS (theoretical) GPU: NVidia GTX 275, 1010 GFLOPS (theoretical) A Parallel Algorithm for ompressive Sensing 17
18 Experiments: Errors Tolerance Ground truth Tolerance Ground truth Number of iterations A Parallel Algorithm for ompressive Sensing 18
19 Experiments: DT Wall clock time (s) ore 2 Quad (4 threads) 840 ore i7 (4 threads) 780 PS3 ell 720 NVIDIA 8800 GTS GPU 660 NVIDIA GTX 275 GPU K 2K 4K 8K 16K 32K K 2K 4K 8K 16K 32K 64K 128K 256K 512K 1M Signal size (n) A Parallel Algorithm for ompressive Sensing 19
20 Experiments: Orthogonalized Gaussian matrices For Intel ore i7 920 (2.66 GHz) Signal of size n with m observations and 0.1m spikes Relative error u found u true 2 u true 2 in out time (s) m n relative #iter 1 th. 2 th. 4 th. 8 th e e e e e e ompare to Bregman based approach [Yin, Osher, Goldfarb,Darbon 08] between 5/10 times faster. A Parallel Algorithm for ompressive Sensing 20
21 Experiments: Approximate versus Exact resolution Figure: A Parallel Algorithm omparaison for ompressiventre Sensing les temps d exécution de notre méthode approchée21 Temps (s) Approx (1 thread) Approx (4 threads) Exact (1 thread) Exact (4 threads) K 2K K 2K 4K 8K 16K n
22 Experiments: Versus Matching Pursuit Figure: A Parallel Algorithm omparaison for ompressive GPU Sensing entre notre méthode (Prox) et l algorithme Matching22 Temps (s) Prox MP K 2K 4K 5 1K 2K 4K 8K 16K n
23 onclusion So far: Use a proximal operator (to assure convergence) Design the operator to allow parallel programming The compiler will perform the parallel optimization for you On Moreau-Yosida/Proximal Point Huge literature gave birth to splitting/preconditioning General theoretical properties: convergence, robustness... [Rockafellar 98] or [Hiriart-Urriuty-Lemarechal 94] Many similar approaches using shrinkage (linearization for instance) using other Proximal points More in [Borghi, Darbon, Peyronnet, han, Osher 08] (ULA am report) Use similar idea for J(u) = u l 1 A Parallel Algorithm for ompressive Sensing 23
Compressed Sensing and L 1 -Related Minimization
Compressed Sensing and L 1 -Related Minimization Yin Wotao Computational and Applied Mathematics Rice University Jan 4, 2008 Chinese Academy of Sciences Inst. Comp. Math The Problems of Interest Unconstrained
More informationCompressive Sensing Algorithms for Fast and Accurate Imaging
Compressive Sensing Algorithms for Fast and Accurate Imaging Wotao Yin Department of Computational and Applied Mathematics, Rice University SCIMM 10 ASU, Tempe, AZ Acknowledgements: results come in part
More informationA parallel patch based algorithm for CT image denoising on the Cell Broadband Engine
A parallel patch based algorithm for CT image denoising on the Cell Broadband Engine Dominik Bartuschat, Markus Stürmer, Harald Köstler and Ulrich Rüde Friedrich-Alexander Universität Erlangen-Nürnberg,Germany
More informationModified Iterative Method for Recovery of Sparse Multiple Measurement Problems
Journal of Electrical Engineering 6 (2018) 124-128 doi: 10.17265/2328-2223/2018.02.009 D DAVID PUBLISHING Modified Iterative Method for Recovery of Sparse Multiple Measurement Problems Sina Mortazavi and
More informationSignal Reconstruction from Sparse Representations: An Introdu. Sensing
Signal Reconstruction from Sparse Representations: An Introduction to Compressed Sensing December 18, 2009 Digital Data Acquisition Suppose we want to acquire some real world signal digitally. Applications
More informationStructurally Random Matrices
Fast Compressive Sampling Using Structurally Random Matrices Presented by: Thong Do (thongdo@jhu.edu) The Johns Hopkins University A joint work with Prof. Trac Tran, The Johns Hopkins University it Dr.
More informationCombinatorial Selection and Least Absolute Shrinkage via The CLASH Operator
Combinatorial Selection and Least Absolute Shrinkage via The CLASH Operator Volkan Cevher Laboratory for Information and Inference Systems LIONS / EPFL http://lions.epfl.ch & Idiap Research Institute joint
More informationThe Alternating Direction Method of Multipliers
The Alternating Direction Method of Multipliers With Adaptive Step Size Selection Peter Sutor, Jr. Project Advisor: Professor Tom Goldstein October 8, 2015 1 / 30 Introduction Presentation Outline 1 Convex
More informationINSTITUTO SUPERIOR TÉCNICO. Architectures for Embedded Computing
UNIVERSIDADE TÉCNICA DE LISBOA INSTITUTO SUPERIOR TÉCNICO Departamento de Engenharia Informática Architectures for Embedded Computing MEIC-A, MEIC-T, MERC Lecture Slides Version 3.0 - English Lecture 12
More informationSparse Solutions to Linear Inverse Problems. Yuzhe Jin
Sparse Solutions to Linear Inverse Problems Yuzhe Jin Outline Intro/Background Two types of algorithms Forward Sequential Selection Methods Diversity Minimization Methods Experimental results Potential
More informationAdministrative Issues. L11: Sparse Linear Algebra on GPUs. Triangular Solve (STRSM) A Few Details 2/25/11. Next assignment, triangular solve
Administrative Issues L11: Sparse Linear Algebra on GPUs Next assignment, triangular solve Due 5PM, Tuesday, March 15 handin cs6963 lab 3 Project proposals Due 5PM, Wednesday, March 7 (hard
More informationAccelerating image registration on GPUs
Accelerating image registration on GPUs Harald Köstler, Sunil Ramgopal Tatavarty SIAM Conference on Imaging Science (IS10) 13.4.2010 Contents Motivation: Image registration with FAIR GPU Programming Combining
More informationCell Broadband Engine. Spencer Dennis Nicholas Barlow
Cell Broadband Engine Spencer Dennis Nicholas Barlow The Cell Processor Objective: [to bring] supercomputer power to everyday life Bridge the gap between conventional CPU s and high performance GPU s History
More informationLecture 19: November 5
0-725/36-725: Convex Optimization Fall 205 Lecturer: Ryan Tibshirani Lecture 9: November 5 Scribes: Hyun Ah Song Note: LaTeX template courtesy of UC Berkeley EECS dept. Disclaimer: These notes have not
More informationIBM Cell Processor. Gilbert Hendry Mark Kretschmann
IBM Cell Processor Gilbert Hendry Mark Kretschmann Architectural components Architectural security Programming Models Compiler Applications Performance Power and Cost Conclusion Outline Cell Architecture:
More informationLarge-scale Deep Unsupervised Learning using Graphics Processors
Large-scale Deep Unsupervised Learning using Graphics Processors Rajat Raina Anand Madhavan Andrew Y. Ng Stanford University Learning from unlabeled data Classify vs. car motorcycle Input space Unlabeled
More informationAn Iteratively Reweighted Least Square Implementation for Face Recognition
Vol. 6: 26-32 THE UNIVERSITY OF CENTRAL FLORIDA Published May 15, 2012 An Iteratively Reweighted Least Square Implementation for Face Recognition By: Jie Liang Faculty Mentor: Dr. Xin Li ABSTRACT: We propose,
More informationPIPA: A New Proximal Interior Point Algorithm for Large-Scale Convex Optimization
PIPA: A New Proximal Interior Point Algorithm for Large-Scale Convex Optimization Marie-Caroline Corbineau 1, Emilie Chouzenoux 1,2, Jean-Christophe Pesquet 1 1 CVN, CentraleSupélec, Université Paris-Saclay,
More information2009: The GPU Computing Tipping Point. Jen-Hsun Huang, CEO
2009: The GPU Computing Tipping Point Jen-Hsun Huang, CEO Someday, our graphics chips will have 1 TeraFLOPS of computing power, will be used for playing games to discovering cures for cancer to streaming
More informationNVIDIA GTX200: TeraFLOPS Visual Computing. August 26, 2008 John Tynefield
NVIDIA GTX200: TeraFLOPS Visual Computing August 26, 2008 John Tynefield 2 Outline Execution Model Architecture Demo 3 Execution Model 4 Software Architecture Applications DX10 OpenGL OpenCL CUDA C Host
More informationQR Decomposition on GPUs
QR Decomposition QR Algorithms Block Householder QR Andrew Kerr* 1 Dan Campbell 1 Mark Richards 2 1 Georgia Tech Research Institute 2 School of Electrical and Computer Engineering Georgia Institute of
More informationCISC 879 Software Support for Multicore Architectures Spring Student Presentation 6: April 8. Presenter: Pujan Kafle, Deephan Mohan
CISC 879 Software Support for Multicore Architectures Spring 2008 Student Presentation 6: April 8 Presenter: Pujan Kafle, Deephan Mohan Scribe: Kanik Sem The following two papers were presented: A Synchronous
More informationIntroduction to Topics in Machine Learning
Introduction to Topics in Machine Learning Namrata Vaswani Department of Electrical and Computer Engineering Iowa State University Namrata Vaswani 1/ 27 Compressed Sensing / Sparse Recovery: Given y :=
More informationInternational Conference on Computational Science (ICCS 2017)
International Conference on Computational Science (ICCS 2017) Exploiting Hybrid Parallelism in the Kinematic Analysis of Multibody Systems Based on Group Equations G. Bernabé, J. C. Cano, J. Cuenca, A.
More informationA fast algorithm for sparse reconstruction based on shrinkage, subspace optimization and continuation [Wen,Yin,Goldfarb,Zhang 2009]
A fast algorithm for sparse reconstruction based on shrinkage, subspace optimization and continuation [Wen,Yin,Goldfarb,Zhang 2009] Yongjia Song University of Wisconsin-Madison April 22, 2010 Yongjia Song
More informationFast l 1 -Minimization and Parallelization for Face Recognition
Fast l 1 -Minimization and Parallelization for Face Recognition Victor Shia, Allen Y. Yang, and S. Shankar Sastry Department of EECS, UC Berkeley Berkeley, CA 9472, USA {vshia, yang, sastry} @ eecs.berkeley.edu
More informationExperiences with the Sparse Matrix-Vector Multiplication on a Many-core Processor
Experiences with the Sparse Matrix-Vector Multiplication on a Many-core Processor Juan C. Pichel Centro de Investigación en Tecnoloxías da Información (CITIUS) Universidade de Santiago de Compostela, Spain
More informationThe Benefit of Tree Sparsity in Accelerated MRI
The Benefit of Tree Sparsity in Accelerated MRI Chen Chen and Junzhou Huang Department of Computer Science and Engineering, The University of Texas at Arlington, TX, USA 76019 Abstract. The wavelet coefficients
More information2D and 3D Far-Field Radiation Patterns Reconstruction Based on Compressive Sensing
Progress In Electromagnetics Research M, Vol. 46, 47 56, 206 2D and 3D Far-Field Radiation Patterns Reconstruction Based on Compressive Sensing Berenice Verdin * and Patrick Debroux Abstract The measurement
More informationEfficient MR Image Reconstruction for Compressed MR Imaging
Efficient MR Image Reconstruction for Compressed MR Imaging Junzhou Huang, Shaoting Zhang, and Dimitris Metaxas Division of Computer and Information Sciences, Rutgers University, NJ, USA 08854 Abstract.
More informationELEG Compressive Sensing and Sparse Signal Representations
ELEG 867 - Compressive Sensing and Sparse Signal Representations Gonzalo R. Arce Depart. of Electrical and Computer Engineering University of Delaware Fall 211 Compressive Sensing G. Arce Fall, 211 1 /
More informationTHE AUSTRALIAN NATIONAL UNIVERSITY First Semester Examination June COMP3320/6464/HONS High Performance Scientific Computing
THE AUSTRALIAN NATIONAL UNIVERSITY First Semester Examination June 2014 COMP3320/6464/HONS High Performance Scientific Computing Study Period: 15 minutes Time Allowed: 3 hours Permitted Materials: Non-Programmable
More informationComputer Systems Architecture I. CSE 560M Lecture 19 Prof. Patrick Crowley
Computer Systems Architecture I CSE 560M Lecture 19 Prof. Patrick Crowley Plan for Today Announcement No lecture next Wednesday (Thanksgiving holiday) Take Home Final Exam Available Dec 7 Due via email
More informationCONSOLE ARCHITECTURE
CONSOLE ARCHITECTURE Introduction Part 1 What is a console? Console components Differences between consoles and PCs Benefits of console development The development environment Console game design What
More informationSparse Reconstruction / Compressive Sensing
Sparse Reconstruction / Compressive Sensing Namrata Vaswani Department of Electrical and Computer Engineering Iowa State University Namrata Vaswani Sparse Reconstruction / Compressive Sensing 1/ 20 The
More informationADAPTIVE LOW RANK AND SPARSE DECOMPOSITION OF VIDEO USING COMPRESSIVE SENSING
ADAPTIVE LOW RANK AND SPARSE DECOMPOSITION OF VIDEO USING COMPRESSIVE SENSING Fei Yang 1 Hong Jiang 2 Zuowei Shen 3 Wei Deng 4 Dimitris Metaxas 1 1 Rutgers University 2 Bell Labs 3 National University
More informationAutomated Finite Element Computations in the FEniCS Framework using GPUs
Automated Finite Element Computations in the FEniCS Framework using GPUs Florian Rathgeber (f.rathgeber10@imperial.ac.uk) Advanced Modelling and Computation Group (AMCG) Department of Earth Science & Engineering
More informationIterative Shrinkage/Thresholding g Algorithms: Some History and Recent Development
Iterative Shrinkage/Thresholding g Algorithms: Some History and Recent Development Mário A. T. Figueiredo Instituto de Telecomunicações and Instituto Superior Técnico, Technical University of Lisbon PORTUGAL
More informationParallel and Distributed Sparse Optimization Algorithms
Parallel and Distributed Sparse Optimization Algorithms Part I Ruoyu Li 1 1 Department of Computer Science and Engineering University of Texas at Arlington March 19, 2015 Ruoyu Li (UTA) Parallel and Distributed
More informationCompressive Sensing for Multimedia. Communications in Wireless Sensor Networks
Compressive Sensing for Multimedia 1 Communications in Wireless Sensor Networks Wael Barakat & Rabih Saliba MDDSP Project Final Report Prof. Brian L. Evans May 9, 2008 Abstract Compressive Sensing is an
More informationsmooth coefficients H. Köstler, U. Rüde
A robust multigrid solver for the optical flow problem with non- smooth coefficients H. Köstler, U. Rüde Overview Optical Flow Problem Data term and various regularizers A Robust Multigrid Solver Galerkin
More informationRobust Principal Component Analysis (RPCA)
Robust Principal Component Analysis (RPCA) & Matrix decomposition: into low-rank and sparse components Zhenfang Hu 2010.4.1 reference [1] Chandrasekharan, V., Sanghavi, S., Parillo, P., Wilsky, A.: Ranksparsity
More informationGPGPUs in HPC. VILLE TIMONEN Åbo Akademi University CSC
GPGPUs in HPC VILLE TIMONEN Åbo Akademi University 2.11.2010 @ CSC Content Background How do GPUs pull off higher throughput Typical architecture Current situation & the future GPGPU languages A tale of
More informationConvex Optimization / Homework 2, due Oct 3
Convex Optimization 0-725/36-725 Homework 2, due Oct 3 Instructions: You must complete Problems 3 and either Problem 4 or Problem 5 (your choice between the two) When you submit the homework, upload a
More informationMassively Parallel Architectures
Massively Parallel Architectures A Take on Cell Processor and GPU programming Joel Falcou - LRI joel.falcou@lri.fr Bat. 490 - Bureau 104 20 janvier 2009 Motivation The CELL processor Harder,Better,Faster,Stronger
More informationMachine Learning / Jan 27, 2010
Revisiting Logistic Regression & Naïve Bayes Aarti Singh Machine Learning 10-701/15-781 Jan 27, 2010 Generative and Discriminative Classifiers Training classifiers involves learning a mapping f: X -> Y,
More informationRobust Face Recognition via Sparse Representation Authors: John Wright, Allen Y. Yang, Arvind Ganesh, S. Shankar Sastry, and Yi Ma
Robust Face Recognition via Sparse Representation Authors: John Wright, Allen Y. Yang, Arvind Ganesh, S. Shankar Sastry, and Yi Ma Presented by Hu Han Jan. 30 2014 For CSE 902 by Prof. Anil K. Jain: Selected
More informationPacketShader: A GPU-Accelerated Software Router
PacketShader: A GPU-Accelerated Software Router Sangjin Han In collaboration with: Keon Jang, KyoungSoo Park, Sue Moon Advanced Networking Lab, CS, KAIST Networked and Distributed Computing Systems Lab,
More informationCentralized versus distributed schedulers for multiple bag-of-task applications
Centralized versus distributed schedulers for multiple bag-of-task applications O. Beaumont, L. Carter, J. Ferrante, A. Legrand, L. Marchal and Y. Robert Laboratoire LaBRI, CNRS Bordeaux, France Dept.
More informationHigh Performance Computing on GPUs using NVIDIA CUDA
High Performance Computing on GPUs using NVIDIA CUDA Slides include some material from GPGPU tutorial at SIGGRAPH2007: http://www.gpgpu.org/s2007 1 Outline Motivation Stream programming Simplified HW and
More informationState of Art and Project Proposals Intensive Computation
State of Art and Project Proposals Intensive Computation Annalisa Massini - 2015/2016 Today s lecture Project proposals on the following topics: Sparse Matrix- Vector Multiplication Tridiagonal Solvers
More informationI How does the formulation (5) serve the purpose of the composite parameterization
Supplemental Material to Identifying Alzheimer s Disease-Related Brain Regions from Multi-Modality Neuroimaging Data using Sparse Composite Linear Discrimination Analysis I How does the formulation (5)
More informationParallelism in Spiral
Parallelism in Spiral Franz Franchetti and the Spiral team (only part shown) Electrical and Computer Engineering Carnegie Mellon University Joint work with Yevgen Voronenko Markus Püschel This work was
More informationCellSs Making it easier to program the Cell Broadband Engine processor
Perez, Bellens, Badia, and Labarta CellSs Making it easier to program the Cell Broadband Engine processor Presented by: Mujahed Eleyat Outline Motivation Architecture of the cell processor Challenges of
More informationOptimization solutions for the segmented sum algorithmic function
Optimization solutions for the segmented sum algorithmic function ALEXANDRU PÎRJAN Department of Informatics, Statistics and Mathematics Romanian-American University 1B, Expozitiei Blvd., district 1, code
More informationSummer 2009 REU: Introduction to Some Advanced Topics in Computational Mathematics
Summer 2009 REU: Introduction to Some Advanced Topics in Computational Mathematics Moysey Brio & Paul Dostert July 4, 2009 1 / 18 Sparse Matrices In many areas of applied mathematics and modeling, one
More informationOptimizing the operations with sparse matrices on Intel architecture
Optimizing the operations with sparse matrices on Intel architecture Gladkikh V. S. victor.s.gladkikh@intel.com Intel Xeon, Intel Itanium are trademarks of Intel Corporation in the U.S. and other countries.
More informationMultiple-View Object Recognition in Band-Limited Distributed Camera Networks
in Band-Limited Distributed Camera Networks Allen Y. Yang, Subhransu Maji, Mario Christoudas, Kirak Hong, Posu Yan Trevor Darrell, Jitendra Malik, and Shankar Sastry Fusion, 2009 Classical Object Recognition
More informationhigh performance medical reconstruction using stream programming paradigms
high performance medical reconstruction using stream programming paradigms This Paper describes the implementation and results of CT reconstruction using Filtered Back Projection on various stream programming
More informationFlexible Batched Sparse Matrix-Vector Product on GPUs
ScalA'17: 8th Workshop on Latest Advances in Scalable Algorithms for Large-Scale Systems November 13, 217 Flexible Batched Sparse Matrix-Vector Product on GPUs Hartwig Anzt, Gary Collins, Jack Dongarra,
More information(Sparse) Linear Solvers
(Sparse) Linear Solvers Ax = B Why? Many geometry processing applications boil down to: solve one or more linear systems Parameterization Editing Reconstruction Fairing Morphing 2 Don t you just invert
More informationIntroduction to Computing and Systems Architecture
Introduction to Computing and Systems Architecture 1. Computability A task is computable if a sequence of instructions can be described which, when followed, will complete such a task. This says little
More informationCompressed Sensing and Applications by using Dictionaries in Image Processing
Advances in Computational Sciences and Technology ISSN 0973-6107 Volume 10, Number 2 (2017) pp. 165-170 Research India Publications http://www.ripublication.com Compressed Sensing and Applications by using
More informationSystem Demonstration of Spiral: Generator for High-Performance Linear Transform Libraries
System Demonstration of Spiral: Generator for High-Performance Linear Transform Libraries Yevgen Voronenko, Franz Franchetti, Frédéric de Mesmay, and Markus Püschel Department of Electrical and Computer
More informationParallel Computing. Hwansoo Han (SKKU)
Parallel Computing Hwansoo Han (SKKU) Unicore Limitations Performance scaling stopped due to Power consumption Wire delay DRAM latency Limitation in ILP 10000 SPEC CINT2000 2 cores/chip Xeon 3.0GHz Core2duo
More informationLect. 2: Types of Parallelism
Lect. 2: Types of Parallelism Parallelism in Hardware (Uniprocessor) Parallelism in a Uniprocessor Pipelining Superscalar, VLIW etc. SIMD instructions, Vector processors, GPUs Multiprocessor Symmetric
More informationThe Trainable Alternating Gradient Shrinkage method
The Trainable Alternating Gradient Shrinkage method Carlos Malanche, supervised by Ha Nguyen April 24, 2018 LIB (prof. Michael Unser), EPFL Quick example 1 The reconstruction problem Challenges in image
More informationTERM PAPER ON The Compressive Sensing Based on Biorthogonal Wavelet Basis
TERM PAPER ON The Compressive Sensing Based on Biorthogonal Wavelet Basis Submitted By: Amrita Mishra 11104163 Manoj C 11104059 Under the Guidance of Dr. Sumana Gupta Professor Department of Electrical
More informationImage Deblurring Using Adaptive Sparse Domain Selection and Adaptive Regularization
Volume 3, No. 3, May-June 2012 International Journal of Advanced Research in Computer Science RESEARCH PAPER Available Online at www.ijarcs.info ISSN No. 0976-5697 Image Deblurring Using Adaptive Sparse
More informationCompressed Sensing Algorithm for Real-Time Doppler Ultrasound Image Reconstruction
Mathematical Modelling and Applications 2017; 2(6): 75-80 http://www.sciencepublishinggroup.com/j/mma doi: 10.11648/j.mma.20170206.14 ISSN: 2575-1786 (Print); ISSN: 2575-1794 (Online) Compressed Sensing
More informationHigh Performance Computing: Tools and Applications
High Performance Computing: Tools and Applications Edmond Chow School of Computational Science and Engineering Georgia Institute of Technology Lecture 15 Numerically solve a 2D boundary value problem Example:
More informationCOMPRESSIVE VIDEO SAMPLING
COMPRESSIVE VIDEO SAMPLING Vladimir Stanković and Lina Stanković Dept of Electronic and Electrical Engineering University of Strathclyde, Glasgow, UK phone: +44-141-548-2679 email: {vladimir,lina}.stankovic@eee.strath.ac.uk
More informationAkarsh Pokkunuru EECS Department Contractive Auto-Encoders: Explicit Invariance During Feature Extraction
Akarsh Pokkunuru EECS Department 03-16-2017 Contractive Auto-Encoders: Explicit Invariance During Feature Extraction 1 AGENDA Introduction to Auto-encoders Types of Auto-encoders Analysis of different
More informationPARALLEL PROGRAMMING MANY-CORE COMPUTING: INTRO (1/5) Rob van Nieuwpoort
PARALLEL PROGRAMMING MANY-CORE COMPUTING: INTRO (1/5) Rob van Nieuwpoort rob@cs.vu.nl Schedule 2 1. Introduction, performance metrics & analysis 2. Many-core hardware 3. Cuda class 1: basics 4. Cuda class
More informationA MATLAB Interface to the GPU
Introduction Results, conclusions and further work References Department of Informatics Faculty of Mathematics and Natural Sciences University of Oslo June 2007 Introduction Results, conclusions and further
More informationImaging of flow in porous media - from optimal transport to prediction
Imaging of flow in porous media - from optimal transport to prediction Eldad Haber Dept of EOS and Mathematics, UBC October 15, 2013 With Rowan Lars Jenn Cocket Ruthotto Fohring Outline Prediction is very
More informationReconstruction Improvements on Compressive Sensing
SCITECH Volume 6, Issue 2 RESEARCH ORGANISATION November 21, 2017 Journal of Information Sciences and Computing Technologies www.scitecresearch.com/journals Reconstruction Improvements on Compressive Sensing
More informationWeighted-CS for reconstruction of highly under-sampled dynamic MRI sequences
Weighted- for reconstruction of highly under-sampled dynamic MRI sequences Dornoosh Zonoobi and Ashraf A. Kassim Dept. Electrical and Computer Engineering National University of Singapore, Singapore E-mail:
More informationCafeGPI. Single-Sided Communication for Scalable Deep Learning
CafeGPI Single-Sided Communication for Scalable Deep Learning Janis Keuper itwm.fraunhofer.de/ml Competence Center High Performance Computing Fraunhofer ITWM, Kaiserslautern, Germany Deep Neural Networks
More informationImage Restoration with Compound Regularization Using a Bregman Iterative Algorithm
Image Restoration with Compound Regularization Using a Bregman Iterative Algorithm Manya V. Afonso, José M. Bioucas-Dias, Mario A. T. Figueiredo To cite this version: Manya V. Afonso, José M. Bioucas-Dias,
More informationOptimization for Machine Learning
with a focus on proximal gradient descent algorithm Department of Computer Science and Engineering Outline 1 History & Trends 2 Proximal Gradient Descent 3 Three Applications A Brief History A. Convex
More information8. Hardware-Aware Numerics. Approaching supercomputing...
Approaching supercomputing... Numerisches Programmieren, Hans-Joachim Bungartz page 1 of 48 8.1. Hardware-Awareness Introduction Since numerical algorithms are ubiquitous, they have to run on a broad spectrum
More informationCell Processor and Playstation 3
Cell Processor and Playstation 3 Guillem Borrell i Nogueras February 24, 2009 Cell systems Bad news More bad news Good news Q&A IBM Blades QS21 Cell BE based. 8 SPE 460 Gflops Float 20 GFLops Double QS22
More informationB. Tech. Project Second Stage Report on
B. Tech. Project Second Stage Report on GPU Based Active Contours Submitted by Sumit Shekhar (05007028) Under the guidance of Prof Subhasis Chaudhuri Table of Contents 1. Introduction... 1 1.1 Graphic
More information8. Hardware-Aware Numerics. Approaching supercomputing...
Approaching supercomputing... Numerisches Programmieren, Hans-Joachim Bungartz page 1 of 22 8.1. Hardware-Awareness Introduction Since numerical algorithms are ubiquitous, they have to run on a broad spectrum
More informationAn R Package flare for High Dimensional Linear Regression and Precision Matrix Estimation
An R Package flare for High Dimensional Linear Regression and Precision Matrix Estimation Xingguo Li Tuo Zhao Xiaoming Yuan Han Liu Abstract This paper describes an R package named flare, which implements
More informationConquering Massive Clinical Models with GPU. GPU Parallelized Logistic Regression
Conquering Massive Clinical Models with GPU Parallelized Logistic Regression M.D./Ph.D. candidate in Biomathematics University of California, Los Angeles Joint Statistical Meetings Vancouver, Canada, July
More informationOn the Comparative Performance of Parallel Algorithms on Small GPU/CUDA Clusters
1 On the Comparative Performance of Parallel Algorithms on Small GPU/CUDA Clusters N. P. Karunadasa & D. N. Ranasinghe University of Colombo School of Computing, Sri Lanka nishantha@opensource.lk, dnr@ucsc.cmb.ac.lk
More informationBilevel Sparse Coding
Adobe Research 345 Park Ave, San Jose, CA Mar 15, 2013 Outline 1 2 The learning model The learning algorithm 3 4 Sparse Modeling Many types of sensory data, e.g., images and audio, are in high-dimensional
More informationG Practical Magnetic Resonance Imaging II Sackler Institute of Biomedical Sciences New York University School of Medicine. Compressed Sensing
G16.4428 Practical Magnetic Resonance Imaging II Sackler Institute of Biomedical Sciences New York University School of Medicine Compressed Sensing Ricardo Otazo, PhD ricardo.otazo@nyumc.org Compressed
More informationCompressive. Graphical Models. Volkan Cevher. Rice University ELEC 633 / STAT 631 Class
Compressive Sensing and Graphical Models Volkan Cevher volkan@rice edu volkan@rice.edu Rice University ELEC 633 / STAT 631 Class http://www.ece.rice.edu/~vc3/elec633/ Digital Revolution Pressure is on
More informationIntroduction to Parallel and Distributed Computing. Linh B. Ngo CPSC 3620
Introduction to Parallel and Distributed Computing Linh B. Ngo CPSC 3620 Overview: What is Parallel Computing To be run using multiple processors A problem is broken into discrete parts that can be solved
More informationINF5063: Programming heterogeneous multi-core processors Introduction
INF5063: Programming heterogeneous multi-core processors Introduction Håkon Kvale Stensland August 19 th, 2012 INF5063 Overview Course topic and scope Background for the use and parallel processing using
More informationThe Alternating Direction Method of Multipliers
The Alternating Direction Method of Multipliers Customizable software solver package Peter Sutor, Jr. Project Advisor: Professor Tom Goldstein April 27, 2016 1 / 28 Background The Dual Problem Consider
More informationA Scalable Parallel LSQR Algorithm for Solving Large-Scale Linear System for Seismic Tomography
1 A Scalable Parallel LSQR Algorithm for Solving Large-Scale Linear System for Seismic Tomography He Huang, Liqiang Wang, Po Chen(University of Wyoming) John Dennis (NCAR) 2 LSQR in Seismic Tomography
More informationA GPU Implementation of Tiled Belief Propagation on Markov Random Fields. Hassan Eslami Theodoros Kasampalis Maria Kotsifakou
A GPU Implementation of Tiled Belief Propagation on Markov Random Fields Hassan Eslami Theodoros Kasampalis Maria Kotsifakou BP-M AND TILED-BP 2 BP-M 3 Tiled BP T 0 T 1 T 2 T 3 T 4 T 5 T 6 T 7 T 8 4 Tiled
More informationG P G P U : H I G H - P E R F O R M A N C E C O M P U T I N G
Joined Advanced Student School (JASS) 2009 March 29 - April 7, 2009 St. Petersburg, Russia G P G P U : H I G H - P E R F O R M A N C E C O M P U T I N G Dmitry Puzyrev St. Petersburg State University Faculty
More informationMeasurements and Bits: Compressed Sensing meets Information Theory. Dror Baron ECE Department Rice University dsp.rice.edu/cs
Measurements and Bits: Compressed Sensing meets Information Theory Dror Baron ECE Department Rice University dsp.rice.edu/cs Sensing by Sampling Sample data at Nyquist rate Compress data using model (e.g.,
More informationCS 664 Structure and Motion. Daniel Huttenlocher
CS 664 Structure and Motion Daniel Huttenlocher Determining 3D Structure Consider set of 3D points X j seen by set of cameras with projection matrices P i Given only image coordinates x ij of each point
More informationPortability and Scalability of Sparse Tensor Decompositions on CPU/MIC/GPU Architectures
Photos placed in horizontal position with even amount of white space between photos and header Portability and Scalability of Sparse Tensor Decompositions on CPU/MIC/GPU Architectures Christopher Forster,
More information