DEEPFINNS. Group 3 Members FPGA IMPLEMENTATION OF A DEEP NEURAL NETWORK FOR SPEECH RECOGNITION. Fall 2016 LINDSAY DAVIS

Similar documents
iphone Noise Filtration Hardware

Implementing the Top Five Control-Path Applications with Low-Cost, Low-Power CPLDs

Lab 1 Introduction to Microcontroller

Variable Frequency Drive Wireless Interface Prototype Project Proposal

Physics 536 Spring Illustrating the FPGA design process using Quartus II design software and the Cyclone II FPGA Starter Board.

CSC 170 Introduction to Computers and Their Applications. Computers

FPGAs: FAST TRACK TO DSP

Advanced Digital Design Using FPGA. Dr. Shahrokh Abadi

Plug and Play Home Automation with Alternative Control

Versatile Link and GBT Chipset Production

discrete logic do not

WHAT ARE MOBILE PHONE SHOPPERS SEARCHING ONLINE?

NIOS II Instantiating the Off-chip Trace Logic

NIOS CPU Based Embedded Computer System on Programmable Chip

Introduction to Embedded Systems

Using the Sensory NLP-5x LCD Module and LCD Sample

ECE 4514 Digital Design II. Spring Lecture 22: Design Economics: FPGAs, ASICs, Full Custom

FPGA system development What you need to think about. Frédéric Leens, CEO

Chapter 5: ASICs Vs. PLDs

Choosing an Intellectual Property Core

Employing Multi-FPGA Debug Techniques

ECE 588/688 Advanced Computer Architecture II

Implementation of Deep Convolutional Neural Net on a Digital Signal Processor

DARUL ARQAM Scope & Sequence Mathematics Grade 5. Revised 05/31/ st Quarter (43 Days) Resources: 1 : Aug 9-11 (3 days)

Unit 4 Part A Evaluating & Purchasing a Computer. Computer Applications

SDACCEL DEVELOPMENT ENVIRONMENT. The Xilinx SDAccel Development Environment. Bringing The Best Performance/Watt to the Data Center

FPGA Adaptive Software Debug and Performance Analysis

A Scalable Speech Recognizer with Deep-Neural-Network Acoustic Models

ECE 1160/2160 Embedded Systems Design. Midterm Review. Wei Gao. ECE 1160/2160 Embedded Systems Design

Memory Systems IRAM. Principle of IRAM

Digital Systems Design. System on a Programmable Chip

SDA: Software-Defined Accelerator for Large- Scale DNN Systems

FPGA Power Management and Modeling Techniques

ICS 52: Introduction to Software Engineering

The simplest form of storage is a register file. All microprocessors have register files, which are known as registers in the architectural context.

Experiment 3. Digital Circuit Prototyping Using FPGAs

SpiNNaker a Neuromorphic Supercomputer. Steve Temple University of Manchester, UK SOS21-21 Mar 2017

SDA: Software-Defined Accelerator for Large- Scale DNN Systems

Advanced FPGA Design Methodologies with Xilinx Vivado

AI Requires Many Approaches

Nigerian Telecommunications Sector

Maximizing Server Efficiency from μarch to ML accelerators. Michael Ferdman

Choices when it comes to your communications infrastructure A BUYER S GUIDE TO IP-BASED SOLUTIONS

Intel Arria 10 FPGA Performance Benchmarking Methodology and Results

HOME :: FPGA ENCYCLOPEDIA :: ARCHIVES :: MEDIA KIT :: SUBSCRIBE

Board Mounted. Power Converters. Digitally Controlled. Technical Paper 011 Presented at Digital Power Europe 2007

CMPE 415 Programmable Logic Devices Introduction

Microprocessors, Lecture 1: Introduction to Microprocessors

Reduce Your System Power Consumption with Altera FPGAs Altera Corporation Public

Chapter 2. Literature Survey. 2.1 Remote access technologies

Abbas El Gamal. Joint work with: Mingjie Lin, Yi-Chang Lu, Simon Wong Work partially supported by DARPA 3D-IC program. Stanford University

GLAST Large Area Telescope:

Multi-threading technology and the challenges of meeting performance and power consumption demands for mobile applications

Overview of Microcontroller and Embedded Systems

Agilent Technologies InfiniiVision MSO N5434A FPGA Dynamic Probe for Altera

FPGA introduction 2008

Offering compact implementation of sophisticated, high-performance telematics products and industrial equipment, and short development times

Can FPGAs beat GPUs in accelerating next-generation Deep Neural Networks? Discussion of the FPGA 17 paper by Intel Corp. (Nurvitadhi et al.

White Paper Performing Equivalent Timing Analysis Between Altera Classic Timing Analyzer and Xilinx Trace

GROUP 17 COLLEGE OF ENGINEERING & COMPUTER SCIENCE. Senior Design I Professor: Dr. Samuel Richie UNIVERSITY OF CENTRAL FLORIDA

CONSIDERATIONS FOR THE DESIGN OF A REUSABLE SOC HARDWARE/SOFTWARE

ECE 588/688 Advanced Computer Architecture II

1. Designing a 64-word Content Addressable Memory Background

Embedded Systems: Hardware Components (part I) Todor Stefanov

Chapter 2 Parallel Hardware

White Paper Low-Cost FPGA Solution for PCI Express Implementation

Cover TBD. intel Quartus prime Design software

FPGA Implementation and Validation of the Asynchronous Array of simple Processors

Multi Cycle Implementation Scheme for 8 bit Microprocessor by VHDL

ECE 471 Embedded Systems Lecture 2

Intel Stratix 10 Analog to Digital Converter User Guide

Embedded Systems Dr. Santanu Chaudhury Department of Electrical Engineering Indian Institute of Technology, Delhi. Lecture - 10 System on Chip (SOC)

Ten Reasons to Optimize a Processor

ADDENDUM #2 TO REQUEST FOR PROPOSAL (RFP) #2252 Institutional Network Service, Data Center Co-location Service, and High-Speed Internet Service

DINI Group. FPGA-based Cluster computing with Spartan-6. Mike Dini Sept 2010

Reminder. Course project team forming deadline. Course project ideas. Next milestone

Re: ENSC 440 Project Proposal Voice Recognition System in an MP3 Player

Using ASIC circuits. What is ASIC. ASIC examples ASIC types and selection ASIC costs ASIC purchasing Trends in IC technologies

4. Configuring Cyclone II Devices

FPGA Technology and Industry Experience

HETEROGENEOUS COMPUTE INFRASTRUCTURE FOR SINGAPORE

CHAPTER 1 Introduction of the tnano Board CHAPTER 2 tnano Board Architecture CHAPTER 3 Using the tnano Board... 8

L2: FPGA HARDWARE : ADVANCED DIGITAL DESIGN PROJECT FALL 2015 BRANDON LUCIA

DSP using Labview FPGA. T.J.Moir AUT University School of Engineering Auckland New-Zealand

World s most advanced data center accelerator for PCIe-based servers

Advanced course on Embedded Systems design using FPGA

Laboratory Exercise 8

CMPSC 311- Introduction to Systems Programming Module: Caching

ECE 637 Integrated VLSI Circuits. Introduction. Introduction EE141

When addressing VLSI design most books start from a welldefined

Calibrating Achievable Design GSRC Annual Review June 9, 2002

LECTURE 1. Introduction

HAND GESTURE RECOGNITION USING MEMS SENSORS

Proposal for Team 8 (Smart Phone Control of Advanced Sensor Systems) Executive Summary

What is Information Technology. Chapter 1: Computer System. Why use Information Technology? What is Data?

S2C K7 Prodigy Logic Module Series

An Introduction to Programmable Logic

Handouts. FPGA-related documents

FPGAs Provide Reconfigurable DSP Solutions

DARPA-BAA Hierarchical Identify Verify Exploit (HIVE) Frequently Asked Questions (FAQ) August 18, 2016

Transcription:

Fall 2016 DEEPFINNS FPGA IMPLEMENTATION OF A DEEP NEURAL NETWORK FOR SPEECH RECOGNITION Group 3 Members LINDSAY DAVIS Electrical Engineering MICHAEL LOPEZ-BRAU Electrical Engineering ESTELLA GONG Computer Engineering CEDRIC ORBAN Electrical Engineering

Initial Project and Group Identification 1 Introduction People have the remarkable ability of being able to make vast inferences about the world with very little information, as well as being able to communicate these inferences to others. How are humans able to effectively use language, an unfathomably flexible and informal tool, to clearly exchange ideas and develop new ones? Is it possible to train machines to communicate with us in this way? These questions and many like them have piqued the interest of many computer scientists and engineers, leading to research thrusts in computational linguistics and natural language processing. Recent advances in hardware have caused a resurgence of development in deep neural network modelling (DNN), now commonly known as deep learning. In the past, neural network models with many hidden layers were seen as inferior due to their computational complexity but now many of the most common algorithms can be run on a middle-to-high end laptop. Deep learning approaches have become very popular as of late due to their ability to dominate other machine learning algorithms on almost every benchmark. We have noticed that these algorithms also perform exceptionally well in various linguistic domains, such as speech recognition, a subset of computational linguistics. In this project, we focus on the design and applications of speech recognition, particularly in the domain of human-computer interfaces. Speech recognition systems generally deal with several important steps, namely: 1. Data acquisition and pre-processing 2. Recognition/decoding 3. Application interface (e.g. moving a prosthetic arm) Conventional speech recognition systems use hidden Markov models (HMM) in their recognition/decoding step. With the popularity and success of deep learning, we would like to design a system that mirrors current research thrusts by implementing a deep learning framework in our speech recognition system, specifically on an FPGA chip. FPGA chips have the advantage of being significantly cheaper than ASIC variants for small-scale applications. They also have the ability to run a neural network faster than a CPU and with less power consumption than a GPU. We would our speech recognition system to have a small vocabulary of pre-registered words that the user can utilize to get started, with the possibility of adding their own words in the future. Our motivation for this approach comes from the desire to design a speech recognition system that is affordable, energy-efficient, low-cost, and portable while being able to maintain the computational sophistication needed for keeping the word error rates as low as possible. More generally, our motivation for this project stems from its interdisciplinary nature: speech recognition technologies often employ tools from a myriad of disciplinary areas, such as computer hardware, linear algebra, signal processing, and machine learning to name a few. Though our application is not solidified, ultimately, we aim to create an FPGA alternative to the typical CPU/GPU setups that are often used in machine learning applications.

2 Requirements and Constraints 2.1 Money We will need to purchase two major components at the least. Each of these components can be very expensive. We plan on applying for multiple project sponsors in order to reach our fundraising goal and be able to purchase all the necessary parts. SoarTech is one of the sponsors we plan to reach out to. We will have to wrestle with using smaller, less expensive components while trying to maintain accurate speech recognition. 2.2 Time The largest challenge to this project is the hard deadline for completion. We only have 7 months to complete this project. We will need to work a lot on our own and keep up with our own share of the workload in order to finish the project with a promising deliverable. Efficiently scheduling our time and carefully planning project steps will help us succeed. 2.3 FPGA The size of our DNN is directly proportional to the amount of logic we have available for use on the chip. Using a bigger chip implies a bigger neural net and higher recognition accuracy at the expense of actual currency. Using an FPGA entails routing many more traces on our PCB than a typical microprocessor would make use of. These traces could belong to programming interfaces such as JTAG and USB and connect to break-out pins to serve as general purpose IO (GPIO) pins. This factor directly affects the number of layers needed to implement our design as well as the width and height of the board itself. We will need to examine our project carefully to determine what connections are actually needed in order to keep costs down. Additionally, some vendors only provide a free programming environment for some specific models of their FPGAs. For example, Altera only provides synthesis and fitting functionality in their Quartus Prime tool for their lower-end FPGAs (the Cyclone series and some Arria models). 3 Specifications Hardware o PCB with dedicated memory and power supply o PCB shall have inputs for: FPGA USB JTAG Microphone o FPGA will need to process a certain amount of voice information o Memory will need be able to store the weights of the DNN (or the parameters for the HMM) as well as a small preset vocabulary o High-quality microphone

Engineering Requirements Signal Integrity Performance Power Consumption Accuracy 7) Cost Software o Speech signal data set for training the DNN o Command set for when it should start deciphering voice signals o Detect and filter out any background noise 4 House of Quality Table Marketing Requirements + + + + + Signal Integrity + FPGA Size + Performance + Power Consumption + PCB Dimensions + Accuracy + Cost - Signal Integrity - The fidelity and signal-to-noise ratio of the speech signal after signal processing has been performed. FPGA Size The amount of logic available for use on the FPGA. Cost typically goes up as this the number of logic resources increase. This directly correlates to the size of our DNN and whether or not signal processing will be performed on the FPGA. Performance How quickly and how efficiently our device can decode speech signals. This depends on how fast we clock the logic in our FPGA, which in turn depends on the model of FPGA we use and what kind of logic we perform (i.e. whether we use DSP or not and whether we use a DNN or HMM).

Accuracy The ability of our device to correctly decode speech signals. We aim to have a low error rate. 5 System Block Diagram Our project can be broken down into 4 parts: 1. Acoustic input, which involves hardware (mic) and pre-processing (filtering, maybe some DSP). 2. Recognition, or decoding stage. This includes developing the DNN/HMM model and coding it on the FPGA. 3. Designing the PCB for the FPGA, mic, and any other peripherals we decide to use, like USB ports. 4. Application we decide to use the speech input on. Primary Responsibility:

Our final PCB would look similar to: 6 Financing 6.1 Budget This is a very rough outline of our projected budget. The final amount we spend will depend on whether we receive corporate sponsorship and the amount the group itself agrees to contribute to the final design. The breakdown goes as follows:

FPGA Chip $300 - $400 Onboard DSP $80 - $90 Onboard Processor $10 - $20 PCB Printing Service $60 - $70 PCB Pick and Place Service $50 - $100 Various Electronic Components (EEPROM, $30 - $50 DRAM, ADC etc) These parts and values are vague because we have not yet determined exactly what will be needed to satisfy our goals. For example, the price and model of the FPGA is dependent on whether or not we perform speech recognition using neural nets or Hidden Markov Models. Some of the components listed may not be used in the project at all. After more research we may determine that it is advantageous to perform digital signal processing on the FPGA itself, freeing us from purchasing a discrete DSP integrated circuit. However, it is clear that some purchases are necessary. These include the PCB printing and pick and place services. 6.2 Sponsors We plan on submitting a proposal to SoarTech, State Farm, Duke Energy, SAIC, and Leidos to see if there is any potential in having our project sponsored by them. If sponsorship is not feasible, our team will be bootstrapping the entire project development.

7 Milestones Milestones Sept. Oct. Nov. Dec. Jan. Feb. Mar. Apr. Research Phase Finalize Application Idea FPGA Machine Learning Methods PCB Design Embedded Programming Speech Recognition Design and Documentation Phase Application Interface PCB interfacing with FPGA PCB interfacing with Microphone PCB interfacing with other peripherals Acoustic Input Recognition Stage Implementation Phase Acoustic Hardware Acoustic Processing Application Software Recognition Stage Software Recognition Stage Hardware (PCB s etc.) Integration and Testing Phase Acoustic Integration Acoustic Testing PCB Integration PCB Testing FPGA Integration FPGA Testing Recognition/Machine Learning Testing Recognition Stage Integration Entire System Integration Entire System Testing