High Performance and Highly Reliable SSD

Similar documents
B4-Flash for Tier 0 SSD Applications

NAND Flash Memory. Jinkyu Jeong Computer Systems Laboratory Sungkyunkwan University

NAND Flash Basics & Error Characteristics

CS311 Lecture 21: SRAM/DRAM/FLASH

New Architecture for Code-Shadowing Applications. Anil Gupta Technical Executive, Winbond Electronics

From Silicon to Solutions: Getting the Right Memory Mix for the Application

Designing Enterprise SSDs with Low Cost Media

Five Key Steps to High-Speed NAND Flash Performance and Reliability

Flash memory talk Felton Linux Group 27 August 2016 Jim Warner

Embedded 28-nm Charge-Trap NVM Technology

3D Xpoint Status and Forecast 2017

Advanced Information Storage 11

Flash Trends: Challenges and Future

Will Phase Change Memory (PCM) Replace DRAM or NAND Flash?

Virtual Storage Tier and Beyond

The Evolving NAND Flash Business Model for SSD

COMPUTER ARCHITECTURE

Memory technology and optimizations ( 2.3) Main Memory

Flash Memory. Gary J. Minden November 12, 2013

Flash TOSHIBA TOSHIBA

16:30 18:00, June 20 (Monday), 2011 # (even student IDs) # (odd student IDs) Scope

Using MLC Flash to Reduce System Cost in Industrial Applications

Don t Forget the Memory. Dean Klein, VP Memory System Development Micron Technology, Inc.

Optimize your system designs using Flash memory

SD series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date:

MIRA M2320 series Halfslim SSD Specification. v0.1

Memory Device Evolution

Could We Make SSDs Self-Healing?

SD series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date:

Advanced Flash Technology Status, Scaling Trends & Implications to Enterprise SSD Technology Enablement

DMP SATA DOM SDM-4G-V SDM-8G-V SDM-16G-V

Designing with External Flash Memory on Renesas Platforms

Flash Memory Overview: Technology & Market Trends. Allen Yu Phison Electronics Corp.

Solid State Drives (SSDs) Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

NAND Flash Module 69F64G16 69F128G16 69F256G16 FEATURES: Preliminary

Transitioning from e-mmc to UFS: Controller Design. Kevin Liu ASolid Technology Co., Ltd.

MANAGING MULTI-TIERED NON-VOLATILE MEMORY SYSTEMS FOR COST AND PERFORMANCE 8/9/16

The Memory Hierarchy 1

Flash Memories. Ramin Roosta Dept. of Computer Engineering. EE 595 EDA / ASIC Design Lab

Developing Low Latency NVMe Systems for HyperscaleData Centers. Prepared by Engling Yeo Santa Clara, CA Date: 08/04/2017

Versatile RRAM Technology and Applications

Key Points. Rotational delay vs seek delay Disks are slow. Techniques for making disks faster. Flash and SSDs

Toward a Memory-centric Architecture

A Novel On-the-Fly NAND Flash Read Channel Parameter Estimation and Optimization

Markets for 3D-Xpoint Applications, Performance and Revenue

The Many Flavors of NAND and More to Come

Vulnerabilities in MLC NAND Flash Memory Programming: Experimental Analysis, Exploits, and Mitigation Techniques

Information Storage and Spintronics 10

NAND Controller Reliability Challenges

NAND Flash-based Storage. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Optimizing Your Memory Sub-System

Datasheet. Embedded Storage Solutions. Industrial. SATA III 2.5 Solid State Drive. SED2FIV Series CSXXXXXRC-XX09XXXX. 04 Jul 2016 V1.

SSD Server Hard Drives for Dell

QLC Challenges. QLC SSD s Require Deep FTL Tuning Karl Schuh Micron. Flash Memory Summit 2018 Santa Clara, CA 1

SSD Server Hard Drives for IBM

Middleware and Flash Translation Layer Co-Design for the Performance Boost of Solid-State Drives

1MG3-P Series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date:

NAND Flash-based Storage. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Don t Forget the Memory. Dean Klein, VP Advanced Memory Solutions Micron Technology, Inc.

3SE4 Series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date:

Tri-Hybrid SSD with storage class memory (SCM) and MLC/TLC NAND Flash Memories

PCIe Storage Beyond SSDs

NAND Flash Architecture and Specification Trends

Hardware NVMe implementation on cache and storage systems

3ME4 Series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date:

Macrotron Systems, Inc. Flash Media Products. Part Number Family: 2.5 PATA (IDE) SSD (High Performance Series) PA25SMXXXXXMXXMX

FLASH DATA RETENTION

Novel Nonvolatile Memory Hierarchies to Realize "Normally-Off Mobile Processors" ASP-DAC 2014

How Good Is Your Memory? An Architect s Look Inside SSDs

NAND Flash Memory. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

Data Retention in MLC NAND Flash Memory: Characterization, Optimization, and Recovery

SD 3.0 series (MLC) Customer Approver. Approver. Customer: Customer Part Number: Innodisk Part Number: Model Name: Date:

Disks and RAID. CS 4410 Operating Systems. [R. Agarwal, L. Alvisi, A. Bracy, E. Sirer, R. Van Renesse]

NAND Flash: Where we are, where are we going?

ECEN 449 Microprocessor System Design. Memories

3ME4 Series. Customer Approver. Innodisk Approver. Customer: Customer Part Number: Innodisk Part Number: Innodisk Model Name: Date:

Selecting a Flash Memory Solution for Embedded Applications

Content courtesy of Wikipedia.org. David Harrison, CEO/Design Engineer for Model Sounds Inc.

Innovations in Non-Volatile Memory 3D NAND and its Implications May 2016 Rob Peglar, VP Advanced Storage,

Module 6: INPUT - OUTPUT (I/O)

Solid State Drives Moving into Design

MS800 Series. msata Solid State Drive Datasheet. Product Feature Capacity: 32GB,64GB,128GB,256GB,512GB Flash Type: MLC NAND FLASH

It Takes Guts to be Great

NAND Flash-based Storage. Jin-Soo Kim Computer Systems Laboratory Sungkyunkwan University

The Evolving NAND Flash Business Model for SSD. Steffen Hellmold VP BD, SandForce

NAND Flash Memory: Basics, Key Scaling Challenges and Future Outlook. Pranav Kalavade Intel Corporation

Toward Seamless Integration of RAID and Flash SSD

Industrial Flash Module Series

Computer Organization. 8th Edition. Chapter 5 Internal Memory

Alternative Erase Verify : The Optimization for Longer Data Retention of NAND FLASH Memory

MLC 2.5 SATA III SSD

Overview of Persistent Memory FMS 2018 Pre-Conference Seminar

The impact of 3D storage solutions on the next generation of memory systems

HLNAND: A New Standard for High Performance Flash Memory

3D NAND - Data Recovery and Erasure Verification

Reducing Solid-State Storage Device Write Stress Through Opportunistic In-Place Delta Compression

Storage Systems : Disks and SSDs. Manu Awasthi July 6 th 2018 Computer Architecture Summer School 2018

3ME3 Series. Customer Approver. InnoDisk Approver. Customer: Customer Part Number: InnoDisk Part Number: InnoDisk Model Name: Date:

Virtium eusb Flash Module. Product Specification VUS 200 Series

A Prototype Storage Subsystem based on PCM

Transcription:

High Performance and Highly Reliable SSD -Proposal of the Fastest Storage with B4-Flash - Moriyoshi Nakashima GENUSION,Inc http://www.genusion.co.jp/ info@genusion.co.jp Santa Clara, CA 1

Big Data comes from tons of Small Data - High speed computing of Big Data can be based on ultra speed processing of Small Data. How small? ; 32B, 128B, 256B, 512B or others How Fast? ; Latency 100 ns Throughput 1GB-6GB/sec How Frequent? ; IOPS > 1000K Non Volatile? ; Yes, of course! - How to correspond to these kinds of requirements? B4-Flash Memory can offer excellent features for Small Data processing. Random read latency ; 100 ns Re-write performance ; equivalent to that of NAND Smallest Unit Data Size; 32B / die with on-chip ECC for read page and write page No need of complicated F/W for wear leveling thanks to its high reliability Santa Clara, CA 2

B4-Flash brings New Features for Storage NOR N+ N+ P-well NAND N+ N+ P-well Read Speed : Fast Scaling: NG Prog: Slow Erase: Slow Cost: Low Prog: Fast Erase: Fast Read Speed : Slow B4-Flash Memory New Flash Mechanism P+ P+ N-well Read Speed: Fast Scaling: Good Prog: Fast Erase: Fast Reliability: Excellent Santa Clara, CA 3

- B4-Flash achieves 100ns NOR read speed and NAND level re-write speed simultaneously. - Read Latency 100ns can be offered for Storage. 100ns Read Latency can be achieved Read Speed Faster 10us 1us 100ns NOR NAND B4- Flash 0.1MB/s 1MB/s 10MB/s Faster Program Speed For Storage Santa Clara, CA 4 X200 Faster Read

B4-Flash is a Realistically new kind of NVM Vd=0V Back Bias assisted Band to Band tunneling (B4) - Hot Electron Injection 3 Vg=12V Vs=1.8V Vd=0V 1 Vg=12V 2 3 1 Band-to-bandtunneling electron-hole pair generation (GIDL) Vd=0V B4-Flash 58nm 58nm 90nm 90nm P+ P+ 1 2 N-well Vd=0V Vwell=5V 3 Hot Electron Injection Vg=12V SiO2 CG FG Vs=5V 2 Electron Acceleratio n SiO2 barrier Vwell=5V B4-HE injection + Pch MOS transistor memory cell - Fastest Re-writable and Highly Scalable NOR Flash - High Endurance to 100K E/W and Excellent Retention of 20years after 100K E/W World first Lg=58nm 58nm NOR Flash 4Gb/2Gb 1Gb/512Mb Santa Clara, CA 5 256Mb /128Mb B4-Flash achieves the scalable NOR with fast re-write capability and excellent reliability.

B4-Flash can offer New Value for Storage Read Disturb Immunity; 240 billion cycles after 100K E/W Wide Temp. Range Operation -40 125 (Optional) Random Page Program (32B) Low Read Latency; 90ns 32B/page B4-Flash adds New Value for Data Storage in the case of 2Gb (SLC) chip Cycling Endurance more than 100K cycles Excellent Data Retention 125C, 20 years After 100K cycles Santa Clara, CA 6

B4-Flash Memory Main Specifications Power Supply Voltage G28FVW5121S1TE G29FXW002Kx1 G29FXW002Kx1 G29FXW004Kx1 SLC, 1die MLC, 2die SLC, 1die MLC, 1die Memory Density 512Mbit 2Gbit 2Gbit 4Gbit Process Node 90nm 90nm 58nm 58nm I/O x8/x16 parallel x8/x16 parallel x16 parallel x16 parallel Core (VCC) 1.7V~2.0V 1.7V~2.0V 2.7V~3.6V 2.7V~3.6V I/O (VCCQ) 1.7V~2.0V, 2.7V~3.6V 1.7V~2.0V, 2.7V~3.6V 1.65V~Vcc 1.65V~Vcc Operation Temperature -40 ~85-40 ~85-40 ~85-40 ~85 Page Size Page(program) 16W to 512W * 16W to 512W * 16W to 2KW* 16W to 4KW* Block(Erase) 512KW, 2MW** 1MW, 4MW** 1MW, 4MW** 2MW, 8MW** Random Access 115ns 200ns 90ns 110ns Page Read 30ns 30ns 15ns 15ns Page Size 16W 16W 16W 16W Read 565ns 650ns 315ns 335ns 16Word Read 57MB/s 49MB/s 102MB/s 96MB/s Read Current(typ.) 12mA 14mA 10mA 10mA Block Erase Time(typ.) 100ms 100ms 130ms 130ms Page Program Time(typ.) 150us 1000us 210us 1350us Chip Erase Time(typ.) 2 sec** 2sec 5sec** 5sec** Chip Program Time(typ.) 10 sec** 131us** 15sec** 90sec** Chip Erase Speed (typ.) 40MB/s 160MB/s 62MB/s 123MB/s Chip Program Speed (typ.) 7MB/s 2MB/s 19MB/s 6MB/s Standby Standby current(typ.) 370uA 1000uA 100uA 100uA P/E Operation Operation Current(typ.) 35mA 35mA 35mA 35mA Reliability Items Data Retention 20year (Min.) @125C after 100K E/W 20year (Min.) @85C after 1K E/W 20year (Min.) @125C after 100K E/W 20year (Min.) @85C after 1K E/W P/E cycles 100K(Min.) 10K (Min.) 100K (Min.) 100K (Min.) Byte Continuous Read 24 billion after 100K E/W 2.8 billion after 10K E/W B4-Flash 240 billion after 100K E/W 28 billion after 10K E/W Santa Clara, CA 7

58nm 2Gb/4Gb B4-Flash Memory Architecture The current architecture of B4-Flash is based on NOR but it can be easily applied for the Storage 58nm/2Gb SLC 2MB 1 Page (Read) = 32Byte with on-chip ECC 32B 1 Page (Program) = 32B, 64B, 128B, 160B, 192B, 224B, 256B, ----, 1024Byte, ----, 4096Byte adjustable to same page size as read 2MB 2MB 2MB 1 Block (Erase) = 1024Bx2048page = 2MByte 1 Bank = 2MBx32Blocks = 64MByte 1 Chip = 64MBx4Banks = 256MByte 1024B 1024B 1024B 1024B Bank1 Bank2 Bank3 Bank4 58nm/4Gb MLC 2048B 2048B 2048B 2048B 4MB 4MB 4MB 4MB Bank1 Bank2 Bank3 Bank4 32B 1 Page (Read) = 32Byte with on-chip ECC 1 Page (Program) = 32B, 64B, 128B, 160B, 192B, 224B, 256B, ----, 2048Byte, ----, 8192Byte adjustable to same page size as read 1 Block (Erase) = 2048Bx2048page = 4MByte 1 Bank 1 Chip = 4MBx16Blocks = 64MByte = 64MBx4Banks = 512MByte Santa Clara, CA 8

CPU Optimal I/F Controller and I/F Chip Simple Storage Model with B4-Flash x n MUX & I/F chip MUX & I/F MUX & I/F x16bit x m 32B+ECC B4-Flash B4-Flash Unit Data is 32B x16 90ns read latency 315ns for 32B read 1bit on-chip ECC Arbitral 32B write No need of complicated F/W for wear leveling 100K E/W capability No Read Disturb in 240 billion read cycles after 100K E/W Excellent retention internal ECC Santa Clara, CA 9

Each calculation is done using the die performance without any overhead of the system as assumption. Case Study of Simple Storage Model -1- Case 1) n x m =8 chip parallel, System Page Size=256B Read; latency=90ns, throughput=812mb/sec, IOPS=3.2M Case 2) n x m =16 chip parallel, System Page Size=512B Read; latency=90ns, throughput=1.6gb/sec, IOPS=3.2M Case 3) n x m =32 chip parallel, System Page Size=1KB Read; latency=90ns, throughput=3.3gb/sec, IOPS=3.2M Parallel Read Die # (n x m) Read Bus (bit) System Read Page Size (Byte) Read Latency (ns) Read Speed (MB/sec) Read IOPS (K) for each Data Size 1 4 8 16 32 64 16 64 128 256 512 1,024 32 128 256 512 1024 2048 90 90 90 90 90 90 102 406 813 1,625 3,251 6,502 32B 3,175 64B 1,587 128B 794 3,175 256B 397 1,587 3,175 512B 198 794 1,587 3,175 1KB 99 397 794 1,587 3,175 2KB 50 198 397 794 1,587 3,175 Santa Clara, CA 4KB 25 99 198 397 794 1,587 10

Case Study of Simple Storage Model -2- IOPS (K) 3,500 3,000 2,500 2,000 1,500 1,000 500 0 32B 64B 128B According to necessary Small Read Page Size, Best Die Parallelism can be obtained for IOPS >3M with 90ns of read latency. 256B 512B 1KB System Read Page Size 2KB 4KB Parallel Die # 1 die 4 dies 8 dies 16 dies 32 dies 64 dies Highest Read IOPS for each Data Size 7,000 6,000 5,000 4,000 3,000 2,000 1,000 0 1 4 8 16 32 64 Parallel Die # of System for Page Read Santa Clara, CA overhead of the system as assumption. 11 Read Speed (MB/sec) Parallel Die # = n x m Read Throughput with Parallelism Each calculation is done using the die performance without any

64GB Fast Storage 64GB High Performance Data Storage with B4-Flash System Page Size; 512B 2Gb SLC B4-Flash; ((8die MCP x 2chips)x4) x4modules Parallel 256bit bus System I/F Controller and I/F Chip x 4 MUX & I/F chip MUX & I/F x16bit x 4 32B+ECC B4-Flash B4-Flash 2Gb/ 16 dies Expected Performance Read latency; 90ns IOPS of 512B; 3.2M Read throughput; 1.6GB/sec Concept of 64GB B4-Flash Storage Write performance must depend on the controller system architecture itself and it should be decided using the same level of B4-Flash SLC write performance as that of SLC NAND. Each calculation is done using the die performance without any overhead of Santa Clara, CA the system as assumption. 12

TCI (Thru Chip Interconnect) for Higher Speed TCI (Thru Chip Interconnect) will be implemented on each die and eliminate so much busses and wires around dies. It can make higher speed interconnection among memory dies in module. And TCI will bring the benefit for the module design to achieve higher performance of the storage. System I/F Controller and I/F Chip MUX & I/F chip MUX & I/F >10Gbps >10Gbps >10Gbps >10Gbps TCI 32B+ECC High Speed B4-Flash Module High Speed B4-Flash Module 2Gb/ 16 dies Santa Clara, CA 13

StorageB4 Main Specifications (including small sector types) Power Supply Voltage Page Size Read Sector Erase Page Program Chip Speed Reliability Items Memory Density Process Node I/O Operation Temperature G29FXW002Kx1 G29FXW004Kx1 SLC, 1die MLC, 1die SLC, 1die MLC, 1die 2Gbit 4Gbit 2Gbit 4Gbit 58nm 58nm 58nm 58nm x16 parallel x16 parallel x16 parallel x16 parallel Core (VCC) 2.7V~3.6V 2.7V~3.6V 2.7V~3.6V 2.7V~3.6V I/O (VCCQ) 1.65V~Vcc 1.65V~Vcc 1.65V~Vcc 1.65V~Vcc -40 ~85-40 ~85-40 ~85-40 ~85 Page(program) 16W to 2KW* 16W to 4KW* 16W to 64W 16W to 128W Sector (Erase) 1MW, 4MW** 2MW, 8MW** 32KW 64KW Random Access 90ns 110ns 180ns 200ns Page Read 15ns 15ns 20ns 20ns Page Size 16W 16W 16W 16W 16Word Read 315ns 335ns 480ns 500ns 102MB/s 96MB/s 67MB/s 64MB/s Sector Erase Time(typ.) 130ms 130ms 2ms 2ms Block Erase (x16 Sector) Time(typ.) - - 6ms 6ms Back Ground Erase No No Yes Yes Page Program Time(typ.) 210us 1350us 150us 1000us 16 Page Program Time(typ.) - - 300us 2000us Erase Time(typ.) 5sec (4Bank) 5sec (4Bank) 7.7sec (16sector) 7.7sec(16 sector) Program Time(typ.) 15sec (4Bank) 90sec (4Bank) 39sec (16sector) 262sec (16sector) Erase Speed (typ.) 62MB/s 123MB/s 333MB/s 333MB/s Program Speed (typ.) 19MB/s 6MB/s 7MB/s 2MB/s Data Retention NOR base Current B4-Flash products 20year (Min.) @125C after 100K E/W Type A 20year (Min.) @85C after 1K E/W StorageB4 (Storage ApplicationDedicated) (Under Planning) Under Estimation Type B Under Estimation P/E cycles 100K (Min.) 100K (Min.) 100K (Min.) 100K (Min.) 240 billion 28 billion 24 billion 2.8 billion Byte Continuous Read Santa Clara, CA after 100K E/W after 10K E/W after 100K E/W after 10K E/W 14

StorageB4 2Gb/4Gb Architecture (small sector) Storage B4 will be developed as Fast Storage Application dedicated Memory with B4-Flash architecture. 58nm/2Gb SLC 64KB 64KB 64KB 64KB 1024B 1MB - - - - - - - - - - - - 64KB 64KB 32B 1 Page (Read) = 32Byte with on-chip ECC 1 Page (Program) = 32B, 64B, 96B, 128B, ----, 256B, 512B, ----, 1024Byte, ----, 2048Byte 1 Sector (Erase) adjustable to same page size as read = 128Bx512page = 64KByte 1 Block (Erase) = 64KBx16sector = 1MByte 1 Chip = 1MBx256blocks = 256MByte 58nm/4Gb MLC 128KB 128KB 128KB 128KB 2048B 2MB - - - - - - - - - - - - 128KB 128KB 32B 1 Page (Read) = 32Byte with on-chip ECC 1 Page (Program) = 32B, 64B, 96B, 128B, ----, 256B, ----, 512B, ----, 4096Byte adjustable to same page size as read 1 Sector (Erase) = 256Bx512page = 128KByte 1 Block (Erase) = 128KBx16sector = 2MByte 1 Chip = 2MBx256blocks = 512MByte Santa Clara, CA 15

Higher Density with Low Cost can be achieved 256Gb How to achieve Higher Density B4-Flash Scaling Down to 20nm MLC capability - MLC B4-Flash has been realized in 90nm/58nm process - 4F2 Cell with MLC - Smaller than DRAM (6F2) - MLC may require a little precise F/W of Wear Leveling for write - Read performance must be almost same as that of SLC Virtual Grand Array (VGA)will be the challenge for 2F2 MLC same as 2D NAND in future Santa Clara, CA Memory Density Cell Area (um2) 64Gb 16Gb 4Gb 1Gb 0.1 0.01 0.001 Floating Gate MLC SLC 1Gb MLC 512MbSLC In Production 90nm 90nm 4Gb MLC 2Gb SLC 58nm 0.065um2 0.027um2 Going to Production in 2015 58nm 45nm In development FG Flash 0.008um2 45nm 128Gb MLC VGA 64Gb MLC VGA 32Gb MLC 16Gb MLC 16Gb SLC 8Gb SLC 0.0036um2 30nm Contact less cell (Optional) 30nm 22nm 0.007um2 0.0046um2 22nm Tech. Node Tech. Node 16

B4-Flash for Storage Class Memory Application B4-Storage CPU SRAM DRAM Tier 0 NAND SSD HDD Big Latency Gap Real Solution CPU SRAM DRAM Tier 0 B4-Flash NAND SSD HDD <Read> 100ns Read 32B-512B page IOPS >1M Throughput >1GB/s <Write> Write features can be designed with the controller like that of NAND 17

Creating New Flash Arena Density Conventional NOR NAND B4-Flash Arena High End NOR Year 18

x10 of cost performance - B4 Storage can offer x100 Speed with more cost. - We hope there exist the applications requiring x10 cost performance HDD Cost Low 0.1 Comparison of Cost/Performance Slow Speed 0.001 0.01 0.1 NAND SSD 10 100 1000 Fast Speed 10 B4 Storage Santa Clara, CA Cost High 19

If you have any kind of interest in B4 Storage, please let us know as; Flash Memory Summit Booth; #536 E-mail; fms2015@genusion.co.jp Web.; http://www.genusion.co.jp/index_e.html Thank you for your attention! GENUSION exhibiting at FMS 2015 is partially supported by Japan External Trade Organization (JETRO). Santa Clara, CA 20