ALMA Correlator Enhancement

ALMA Correlator Enhancement: Technical Perspective
Rodrigo Amestica, Ray Escoffier, Joe Greenberg, Rich Lacasse, J Perez, Alejandro Saez
Atacama Large Millimeter/submillimeter Array, Karl G. Jansky Very Large Array, Robert C. Byrd Green Bank Telescope, Very Long Baseline Array

Outline
- Motivation
- Short technical description
- Summary of performance gains
- Cost guesstimates

Motivation
- Potential benefits to science
- Technology has evolved
- Over a decade since the hardware was designed
- Order of magnitude improvement possible without major disruption?
- Moore's Law

Some Technical Details - cost versus design node
- There are significant trade-offs in designing a new ASIC
- [Figure: cost versus design node] (source: EE Times web blog, "FPGA as ASIC alternative", April 21, 2014)

Some Technical Details - example trade-offs
- [Figure: power versus design node (one example)] (source: Nick Kepler, SuVolta Corp., "Rethinking the Pursuit of Moore's Law")

Some Technical Details - change in available technologies
- Industry as a whole is a mixed bag
- [Figure not reproduced] (source: EE Times web blog, "FPGA as ASIC alternative", April 21, 2014)

Some Technical Details - finding the sweet spot
- We have talked with various experts to try to find the sweet spot (reasonable cost and power with significant performance enhancement).
- From what we've learned so far, a factor of 32 density improvement with a factor of 8 resolution improvement at the 45/40 nm node seems to be the sweet spot.
- Preliminary quotes:
Part                            Power (est.)  NRE          Parts cost  Quantity  Total
Custom chip (eSilicon, 40 nm)   18.5 W        $2,275,000   $30         10,000    $2,575,000
Custom chip (eSilicon, 28 nm)   17 W          $3,175,000   $20         10,000    $3,375,000
Custom chip (iSine)             2.6-3.4 W     $2,000,000   $50         10,000    $2,500,000
Custom chip (STM, 40 nm)        6.4/3.45 W    $1,377,600   $83         10,000    $2,206,400
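
As a rough cross-check of the quotes above, the sketch below recomputes each total under the assumption that total = NRE + parts cost x quantity. The slide does not state how the totals were derived, so this cost model and the names QUOTES and computed_total are illustrative only. Under that assumption three rows match exactly and the STM row differs by $1,200, which may simply reflect a rounded unit price.

```python
# Figures copied from the preliminary-quotes table:
# (part, NRE, unit cost, quoted total), all in USD.
QUOTES = [
    ("eSilicon, 40 nm", 2_275_000, 30, 2_575_000),
    ("eSilicon, 28 nm", 3_175_000, 20, 3_375_000),
    ("iSine", 2_000_000, 50, 2_500_000),
    ("STM, 40 nm", 1_377_600, 83, 2_206_400),
]
QUANTITY = 10_000  # parts per quote, from the table


def computed_total(nre, unit_cost, quantity=QUANTITY):
    """Assumed cost model: one-time NRE plus unit cost times quantity."""
    return nre + unit_cost * quantity


# Difference between the assumed model and the quoted total, per vendor.
differences = {
    part: computed_total(nre, unit) - quoted
    for part, nre, unit, quoted in QUOTES
}

for part, diff in differences.items():
    print(f"{part:16s} computed - quoted = ${diff:,}")
```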

Some Technical Details - the big picture and main thrust of the proposal
- Resolution can be enhanced by replacing the custom correlator integrated circuit (IC).
  - It follows that we must replace the correlator cards and everything downstream to handle more data.
  - Provide detailed plan and costing.
- Bandwidth can be enhanced by doubling the clock rate on most cards and hoping the backplane can handle the resulting traffic.
  - Requires replacing cards in the data path in front of the correlator card.
  - Provide road map (i.e., not as much detail!)

Some Technical Details - architecture
- [Figure: architecture diagrams labeled ALMA-1 and ALMA-2]

Some Technical Details - number of cards goes down dramatically
- [Figure: CURRENT (4 cards per plane) versus PROPOSED (1 card per plane)]

Some Technical Details - architecture and data rates
- [Figure: architecture and data rates]

Road map to bandwidth enhancement
- Current chips run at 125 MHz. New chips could be designed to run at > 250 MHz.
- This would be part of what is needed to double the correlator bandwidth. However, the front end of the correlator (Station Racks) would also need to be redesigned to make this possible. It is not clear, at this time, whether or not this is affordable.
- We are currently testing whether or not the existing backplanes can run at 250 MHz. This would allow the upgrade to involve only changing cards and a few cables. Having to replace the motherboards would be more expensive and disruptive.

Impact on ALMA Operations
- Resolution enhancement is not a major perturbation to ALMA operations.
- There are significant challenges in chip, firmware, and software design and test, but these can mostly be done off-line.
- No racks to rip out
- No change in power and clock distribution
- No change in cooling requirements
- From the hardware point of view, it's mostly a matter of swapping cards and adding some cables.

Performance Enhancements - frequency and time resolution, and sensitivity
- x8 enhancement in frequency resolution. Spectral resolution for every FDM mode increases by a factor of eight for integration times of 128 msec and greater. For example, see Table 2 of the correlator specification (next slide) (http://edm.alma.cl/forums/alma/dispatch.cgi/documents/docprofile/100591)
- Time resolution enhancement for auto and cross products
- And if you don't need either improved frequency or time resolution, there is still something in it for you: higher sensitivity or shorter observing times can be obtained using 4-bit x 4-bit modes and/or double Nyquist mode (95% efficiency versus 85%, including the effect of the 3-bit sampler). This is equivalent to adding about 8 antennas to the array or cutting integration times down by 12%! True only for BW < 2 GHz. See the comparison of modes 2 and 53 on the next slide.
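
The efficiency comparison above can be checked with a few lines of arithmetic. The sketch below is not from the slides: it assumes a 64-antenna array (the size named in the further-reading memo title), point-source sensitivity proportional to efficiency x sqrt(N(N-1)), and noise falling as 1/sqrt(integration time). Under those assumptions the efficiency gain is about 12% and corresponds to roughly 7 to 8 extra antennas, while the integration time needed for a fixed noise level drops by about 20%. The slide's 12% figure therefore appears to match the sensitivity gain rather than this time estimate.

```python
import math

EFF_NEW = 0.95   # 4-bit x 4-bit / double Nyquist efficiency, from the slide
EFF_OLD = 0.85   # current efficiency, from the slide
N_ANTENNAS = 64  # assumption: array size taken from the cited memo title


def equivalent_antennas(n_antennas, gain):
    """Array size n whose sqrt(n(n-1)) is `gain` times that of n_antennas."""
    target = gain ** 2 * n_antennas * (n_antennas - 1)
    # Solve n^2 - n - target = 0 for the positive root.
    return (1.0 + math.sqrt(1.0 + 4.0 * target)) / 2.0


gain = EFF_NEW / EFF_OLD                                    # sensitivity ratio
extra_antennas = equivalent_antennas(N_ANTENNAS, gain) - N_ANTENNAS
time_saving = 1.0 - 1.0 / gain ** 2                         # fixed noise level

print(f"sensitivity gain:        {100 * (gain - 1):.1f}%")
print(f"equivalent extra dishes: {extra_antennas:.1f}")
print(f"integration time saving: {100 * time_saving:.1f}%")
```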

Mode Table Changes
Columns: mode number; number of subchannel filters; total bandwidth; number of spectral points (current, proposed); spectral resolution in kHz (current, proposed); correlation; sampling; sensitivity (x 0.96).
Mode  Filters  Bandwidth  Spectral points    Resolution (kHz)   Correlation    Sampling       Sensitivity
#                         Current  Proposed  Current  Proposed                                (x 0.96)
1     32       2 GHz      8192     65536     244      30.5      2-bit x 2-bit  Nyquist        0.88
19    32       2 GHz      4096     32768     488      61        2-bit x 2-bit  Twice Nyquist  0.94
38    32       2 GHz      2048     16384     976      122       4-bit x 4-bit  Nyquist        0.99
2     16       1 GHz      8192     65536     122      15.25     2-bit x 2-bit  Nyquist        0.88
20    16       1 GHz      4096     32768     244      30.5      2-bit x 2-bit  Twice Nyquist  0.94
39    16       1 GHz      2048     16384     488      61        4-bit x 4-bit  Nyquist        0.99
53    16       1 GHz      1024     8192      976      122       4-bit x 4-bit  Twice Nyquist  0.99
3     8        500 MHz    8192     65536     61       7.625     2-bit x 2-bit  Nyquist        0.88
21    8        500 MHz    4096     32768     122      15.25     2-bit x 2-bit  Twice Nyquist  0.94
40    8        500 MHz    2048     16384     244      30.5      4-bit x 4-bit  Nyquist        0.99
54    8        500 MHz    1024     8192      488      61        4-bit x 4-bit  Twice Nyquist  0.99
4     4        250 MHz    8192     65536     30       3.75      2-bit x 2-bit  Nyquist        0.88
22    4        250 MHz    4096     32768     61       7.625     2-bit x 2-bit  Twice Nyquist  0.94
41    4        250 MHz    2048     16384     122      15.25     4-bit x 4-bit  Nyquist        0.99
55    4        250 MHz    1024     8192      244      30.5      4-bit x 4-bit  Twice Nyquist  0.99
5     2        125 MHz    8192     65536     15       1.875     2-bit x 2-bit  Nyquist        0.88
23    2        125 MHz    4096     32768     30       3.75      2-bit x 2-bit  Twice Nyquist  0.94
42    2        125 MHz    2048     16384     61       7.625     4-bit x 4-bit  Nyquist        0.99
56    2        125 MHz    1024     8192      122      15.25     4-bit x 4-bit  Twice Nyquist  0.99
6     1        62.5 MHz   8192     65536     7.6      0.95      2-bit x 2-bit  Nyquist        0.88
24    1        62.5 MHz   4096     32768     15       1.875     2-bit x 2-bit  Twice Nyquist  0.94
43    1        62.5 MHz   2048     16384     30       3.75      4-bit x 4-bit  Nyquist        0.99
57    1        62.5 MHz   1024     8192      61       7.625     4-bit x 4-bit  Twice Nyquist  0.99
25    1        31.25 MHz  8192     65536     3.8      0.475     2-bit x 2-bit  Twice Nyquist  0.94
58    1        31.25 MHz  2048     16384     15       1.875     4-bit x 4-bit  Twice Nyquist  0.99
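
The resolution columns in the mode table appear to follow from dividing the total bandwidth by the number of spectral points, with the proposed design offering eight times as many points per mode. The sketch below reproduces a few rows under that assumption. The function name and the sample rows are illustrative, and the table's own values are rounded.

```python
POINTS_FACTOR = 8  # proposed / current spectral points, per the mode table


def spectral_resolution_khz(bandwidth_mhz, spectral_points):
    """Channel spacing in kHz, assuming resolution = bandwidth / points."""
    return bandwidth_mhz * 1000.0 / spectral_points


# (mode number, total bandwidth in MHz, current spectral points)
SAMPLE_MODES = [(1, 2000.0, 8192), (53, 1000.0, 1024), (6, 62.5, 8192)]

for mode, bandwidth_mhz, current_points in SAMPLE_MODES:
    proposed_points = current_points * POINTS_FACTOR
    current = spectral_resolution_khz(bandwidth_mhz, current_points)
    proposed = spectral_resolution_khz(bandwidth_mhz, proposed_points)
    print(f"mode {mode:2d}: {current:8.3f} kHz -> {proposed:7.3f} kHz")
```

For mode 1 this gives about 244.1 kHz now and 30.5 kHz after the upgrade, in line with the first row of the table.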

Performance Enhancements - time resolution details
- The current implementation has a time resolution of 1 msec for auto products and 16 msec for cross-products.
- The addition of RAM to the correlator chip makes it possible to trade time resolution for spectral resolution on auto and cross-products, a new feature (is this useful?):
Time Resolution (msec)  Spectral Points (per baseband)
1                       512
2                       1024
4                       1024
8                       2048
16                      4096
32                      8192
64                      16384
128                     32768
256                     65536
512                     65536
1024                    65536
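
One way to read the trade-off table above is as a lookup: for a given time resolution, how many spectral points per baseband are available, and conversely how coarse the time resolution must be to reach a given number of points. The sketch below encodes the table values as listed on the slide. The function names are hypothetical, and treating intermediate time resolutions as falling back to the next-lower table entry is an assumption.

```python
import bisect

# Values copied from the slide: time resolution (msec) -> spectral points.
TIME_MS = [1, 2, 4, 8, 16, 32, 64, 128, 256, 512, 1024]
POINTS = [512, 1024, 1024, 2048, 4096, 8192, 16384, 32768, 65536, 65536, 65536]


def max_spectral_points(time_resolution_ms):
    """Spectral points for the largest table entry not exceeding the request."""
    index = bisect.bisect_right(TIME_MS, time_resolution_ms) - 1
    if index < 0:
        raise ValueError("time resolution below the 1 msec table minimum")
    return POINTS[index]


def min_time_resolution_ms(spectral_points):
    """Shortest tabulated time resolution offering at least this many points."""
    for time_ms, points in zip(TIME_MS, POINTS):
        if points >= spectral_points:
            return time_ms
    raise ValueError("more spectral points requested than the table offers")


print(max_spectral_points(16))        # points available at 16 msec
print(min_time_resolution_ms(65536))  # msec needed for the full 65536 points
```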

Costs, very preliminary
- The big-ticket item is surely the cost of the first chip (NRE), ~$2.5M. But it's cheaper than buying a ton of big FPGAs, which at first seems like the obvious solution.
- Hardware costs in addition to chips: ~$1.25M
- Labor: about 4 FTE-years (hardware + software)
- Contingency + overhead
- Bottom line is that it could be done with a fraction of ALMA development funds over 2 to 3 years.

Summary
- We are proposing to study an upgrade of the ALMA correlator.
- The upgrade would provide a factor of eight improvement in spectral resolution as well as improvements in time resolution.
- Improved spectral resolution could be traded for improved sensitivity.
- We will also provide a roadmap to doubling the bandwidth.
- We have submitted an ALMA study proposal in this call.
- We would be interested in some feedback from the scientific community.
- Further reading: "Enhancing the Performance of the 64-antenna ALMA Correlator" (Escoffier, Lacasse, Saez, John Webber, Rodrigo Amestica, Alain Baudry), http://library.nrao.edu/public/memos/naasc/naasc_114.pdf

Backup Slides

Motivation
- Technology has evolved
- Over a decade since the hardware was designed
- Order of magnitude improvement possible without major disruption?
- Moore's Law

Some Technical Details - cost versus design node
- There are significant trade-offs in designing a new ASIC
- [Figure: cost versus design node] (source: EE Times web blog, "FPGA as ASIC alternative", April 21, 2014)

Some Technical Details - example trade-offs
- [Figure: power versus design node (one example)] (source: Nick Kepler, SuVolta Corp., "Rethinking the Pursuit of Moore's Law")

Some Technical Details - change in available technologies
- Industry as a whole is a mixed bag
- [Figure not reproduced] (source: EE Times web blog, "FPGA as ASIC alternative", April 21, 2014)