Architectural Support for Large-Scale Visual Search
Carlo C. del Mundo, Vincent Lee, Armin Alaghi, Luis Ceze, Mark Oskin
2 Motivation: Visual Data & Their Applications
Source: Rebooting the IT Revolution, SIA, September
Visual applications: 3D Reconstruction, Augmented Reality, Activity Recognition, Content-based Search
Algorithm across these apps: k-nearest neighbors (kNN)
Overall challenges: high data movement, high compute requirements
Idea: Co-design the memory subsystem with search algorithms.
3 State of the Art: Today's Software Pipeline
Indexing: Feature Extraction [1] (extract shape, appearance, motion; >1000) -> Index Features -> Multimedia Database
Scale of index: Trillions x Images, Billions x Videos, Billions x Audio
Query: Extract features -> Query Generation -> K-nearest neighbors [2] -> Distance Calculation (calculate pairwise metric distance between the query and each candidate) -> Reverse Lookup -> Recover results -> Response (nearest neighbors)
[1] Wang et al., Action Recognition by Dense Trajectories, in CVPR
[2] Muja and Lowe, Scalable Nearest Neighbor Algorithms for High Dimensional Data, in PAMI '14.
4 State of the Art: Nearest Neighbor Search on Commodity Systems
Intra-node challenge: k-nearest neighbor search is heavily memory bound.
Inter-node challenge: Each query vector generates k local responses. To get the global k nearest neighbors we need an all-to-one reduction; this makes the network the bottleneck.
Figure labels: Query Video, DRAM (x4)
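The all-to-one reduction described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the shard layout, the helper names (`local_knn`, `global_knn`, `squared_euclidean`) and the toy data are hypothetical and do not come from the slides.

```python
import heapq

def squared_euclidean(a, b):
    # Squared distance keeps the same ordering as Euclidean distance.
    return sum((x - y) ** 2 for x, y in zip(a, b))

def local_knn(shard, query, k, dist):
    """Return the k closest (distance, id) pairs within one node's shard."""
    return heapq.nsmallest(k, ((dist(query, vec), vid) for vid, vec in shard))

def global_knn(local_results, k):
    """All-to-one reduction: merge every node's k candidates into the global top k."""
    return heapq.nsmallest(k, (pair for result in local_results for pair in result))

# Hypothetical data: three nodes, each holding (id, vector) pairs.
shards = [
    [(0, (0.0, 0.0)), (1, (5.0, 5.0))],
    [(2, (1.0, 1.0)), (3, (9.0, 9.0))],
    [(4, (0.5, 0.0)), (5, (7.0, 1.0))],
]
query = (0.0, 0.0)
k = 2

local_results = [local_knn(shard, query, k, squared_euclidean) for shard in shards]
result = global_knn(local_results, k)
print(result)  # [(0.0, 0), (0.25, 4)]
```

Every node ships k candidates to the single reducer, so the reducer receives (number of nodes) x k pairs per query, which is the traffic pattern the slide identifies as the network bottleneck.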
5 Our Vision: Systems and Architecture Support for Nearest Neighbor Search
Search. Our Vision: A nearest neighbor content addressable memory (a nearest neighbor CAM) to alleviate memory boundedness by moving compute closer to memory.
Reduction. Our Vision: Systems support for in-network joins and reductions to reduce global network bandwidth demand and improve scalability.
Figure labels: Query Video, Search, Reduce (x6), Reduction
6 Intra-Node: Rebalancing Bandwidth and Compute
Algorithmic growing pains, servicing one query [1]: GB of communication, 150 GFLOP of computation, FLOP/byte
Rebalance bandwidth: Use binary features [2]. > 1000
  FP operators: FMUL, FSUB, FADD
  Bit operators: XOR, POPCOUNT
Rebalance compute: Move compute closer to memory.
  Storage: SSDs or DRAM
  Our Vision: Nearest Neighbor CAM DRAM
[1] Assume 512 KB feature vectors. N = 1 billion. Assume a single query needs to evaluate 10% of N with kd-trees. Distance metric is Euclidean. Note: we assume these numbers are from FLANN.
[2] Gong and Lazebnik, Iterative Quantization: A Procrustean Approach to Learning Binary Codes, in CVPR '11.
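To make the XOR and POPCOUNT operators concrete, here is a minimal sketch of a brute-force k-nearest-neighbor scan over binary codes using Hamming distance. The 8-bit codes, the function names and the choice of Hamming distance as the metric for binary features are assumptions made for illustration; the slides do not specify them.

```python
def hamming(a: int, b: int) -> int:
    """Hamming distance: XOR marks the differing bits, popcount counts them."""
    return bin(a ^ b).count("1")

def knn_hamming(database, query, k):
    """Brute-force scan returning the k closest entries as (id, distance) pairs."""
    scored = sorted((hamming(code, query), vid) for vid, code in enumerate(database))
    return [(vid, dist) for dist, vid in scored[:k]]

# Hypothetical 8-bit binary codes; real binary features would be much longer.
database = [0b10110010, 0b10110011, 0b01001101, 0b11110000]
query = 0b10110000

neighbors = knn_hamming(database, query, k=2)
print(neighbors)  # [(0, 1), (3, 1)]
```

Each comparison needs one XOR and one popcount per machine word instead of a floating-point subtract, multiply and add per dimension, which is the rebalancing the slide points at.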
7 High-Level Overview: Nearest Neighbor Content Addressable Memory
Idea: Achieve high internal bandwidth by interposing logic in memory.
Our Vision: A class of memories akin to CAMs for high-throughput distance calculations.
Figure: Input (binary data) and Parameters (k = 2, distance = h) -> System -> Output {id, distance}
8 Architectural Overview: Inside the
Figure: DRAM device block diagram. Standard blocks shown: command decode and control logic, mode registers, address register, refresh counter, row address MUX, bank control, per-bank row-address latch and decoder, memory arrays (65536 x 128 x 128), sense amplifiers, I/O gating and DM mask logic, ODT control, output buffer.
Added blocks: each bank's memory array and sense amplifiers are paired with a kNN Engine and an Accelerator Bypass; an Engine Read MUX is also shown.
Engine signals: Engine Address EA[ADDRESS_BITS-1:0], Engine Data w[31:0], Engine Operation cmd[2:0], Engine Control, read, write, r[31:0], rvalid, rready
9 Programming Model: Integrating the in DRAMs
1. Additions are orthogonal to existing DRAM infrastructure.
2. We introduce a memory mapped configuration and control.
3. We introduce a command set which allows for configuration.
Systems stack: Programming API/Library, Driver Layer, Command Interface
DRAM side: DRAM Command Interface, DRAM Control, DRAM Array and Controllers, DRAM Read/Write Interface, DRAM Output Interface
Command set:
  Command   Explanation
  READ      Reads the result out from the priority queues
  WRITE     Writes initial query vector to accelerator memory
  ENABLE    Signals to process data read from DRAM
  DISABLE   Signals to ignore any data from the DRAM
  RESET     Reinitializes priority queues in preparation for next execution
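The five commands can be illustrated with a toy software model. Everything below is a hypothetical sketch: the class name, the `observe` hook that stands in for data coming off the DRAM array, and the use of Hamming distance are assumptions, not the actual driver or hardware interface.

```python
import heapq

class KnnAcceleratorModel:
    """Toy model of the READ/WRITE/ENABLE/DISABLE/RESET command set (hypothetical)."""

    def __init__(self, k):
        self.k = k
        self.enabled = False
        self.query = None
        self.queue = []  # max-heap on distance, stored as (-distance, id)

    def write(self, query):
        """WRITE: store the initial query vector in accelerator memory."""
        self.query = query

    def enable(self):
        """ENABLE: start processing data read from DRAM."""
        self.enabled = True

    def disable(self):
        """DISABLE: ignore any data coming from DRAM."""
        self.enabled = False

    def reset(self):
        """RESET: reinitialize the priority queue for the next execution."""
        self.queue = []

    def observe(self, vid, code):
        """Stand-in for one binary code arriving from the DRAM array."""
        if not self.enabled or self.query is None:
            return
        dist = bin(code ^ self.query).count("1")  # assumed Hamming distance
        if len(self.queue) < self.k:
            heapq.heappush(self.queue, (-dist, vid))
        elif dist < -self.queue[0][0]:
            heapq.heapreplace(self.queue, (-dist, vid))

    def read(self):
        """READ: return the priority queue contents as (id, distance), closest first."""
        return [(vid, dist) for dist, vid in sorted((-nd, vid) for nd, vid in self.queue)]

acc = KnnAcceleratorModel(k=2)
acc.write(0b10110000)
acc.observe(0, 0b10110000)          # ignored: accelerator not enabled yet
acc.enable()
for vid, code in enumerate([0b10110010, 0b10110011, 0b01001101, 0b11110000]):
    acc.observe(vid, code)
acc.disable()
acc.observe(9, 0b10110000)          # ignored: accelerator disabled
results = acc.read()
print(results)  # [(0, 1), (3, 1)]
acc.reset()
```

In this model ordinary DRAM reads are unaffected when the accelerator is disabled, mirroring the slide's point that the additions are orthogonal to the existing DRAM infrastructure.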
10 Related Works: What came before the
Software-based solutions
  Libraries
    [Arya, Mount, JACM 1998] ANN: Approximate Nearest Neighbors Library
    [Muja, Lowe, PAMI 2014] FLANN: Fast Library for Approximate Nearest Neighbors
  Algorithms
    [Datar et al., SCG 04] Locality-sensitive Hashing
    [Y. Weiss et al., NIPS 08] Spectral Hashing
    [Torralba et al., CVPR 08] Small codes for Recognition
    [Gong, Lazebnik, CVPR 11] Iterative Quantization
Hardware-based solutions
  Nearest-Neighbors
    [Roberts, MS Thesis 90] PCAM: Proximity Content-Addressable Memory
    [Garcia et al., CVPRW 08] Nearest Neighbors on GPU
  In-memory computing
    [Patterson et al., IEEE MICRO 97] A Case for Intelligent RAM
    [D.G. Elliott, D&T 99] Computational RAM
    [Guo et al., MICRO 11] TCAMs for Data-Intensive Computing
11 Current Work: What We've Done & Plan To Do
Application Design & Characterization.
Design & Implementation of VIRAL: VIdeo and Image Retrieval At Large-Scale.
12 Current Work: What We've Done & Plan To Do
Hardware Design. ASIC, Simulation, Design & Test.
Figure labels (kNN Engine): Write Interface, Input Data, Memory Controller, Address Demux, Data Arbiter, multiple kNN Query Lanes, Read Interface (raddr, r)
Figure labels (kNN Query Lane): Write Interface, Sample Data, Query Buffer, Euclidean Distance Unit, Index Buffer, Accumulator, Priority Queue Shift Register, Read Interface
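The query lane components named on this slide suggest a simple dataflow: accumulate a Euclidean distance per sample, then insert it into a sorted shift register of length k. The following is a behavioral sketch under that assumption; the function names, the use of squared distance (which preserves ordering) and the sample data are illustrative only and say nothing about the actual RTL.

```python
def squared_euclidean(query, sample):
    """Distance unit plus accumulator: sum of squared per-dimension differences."""
    acc = 0
    for q, s in zip(query, sample):
        diff = q - s
        acc += diff * diff
    return acc

def shift_register_insert(register, candidate):
    """Insert (distance, id) into a sorted fixed-length register.

    Entries at or after the insertion slot shift one position toward the
    tail and the last entry falls off, as in a priority shift register.
    """
    k = len(register)
    for i in range(k):
        if register[i] is None or candidate[0] < register[i][0]:
            register[i + 1:] = register[i:k - 1]
            register[i] = candidate
            return

def knn_lane(query, samples, k):
    """Stream every sample through one query lane and return its k best entries."""
    register = [None] * k
    for vid, sample in enumerate(samples):
        shift_register_insert(register, (squared_euclidean(query, sample), vid))
    return [entry for entry in register if entry is not None]

samples = [(3, 4), (1, 1), (0, 2), (5, 5)]
best = knn_lane((0, 0), samples, k=2)
print(best)  # [(2, 1), (4, 2)]
```

A fixed-length register keeps the per-sample work bounded by k comparisons, which is one reason such structures are a natural fit for streaming hardware.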
13 Summary: Enabling visual applications via co-design
Our Vision: Architectural support to enable high dimensional & high cardinality nearest neighbor search.
Figure: Input (binary data) and Parameters (k = 2, distance = l) -> System -> Output {id, distance}
Our work enables visual applications such as: 3D Reconstruction, Content-based Search, Activity Recognition