Architecture for Carbon Nanotube Based Memory () Bill Gervasi Principal Systems Architect 18 August 2018
Agenda 2 Carbon nanotube basics Making & breaking connections Resistive measurements Write endurance, Timing, & Temperature When fades away The universe of Storage Class Memories : Memory Class Storage Standard modules using Industry readiness for persistent memory
Disclaimer 3 Nantero is a technology development and intellectual property licensing company This presentation covers the technology we develop and license Details shown apply to a specific reference design Specific product details and introduction dates relate to availability of the technology Customers & partners control actual product details and dates We d be thrilled to license to YOU, too
CNT Nonvolatile Memory 4 ELECTRODE ELECTRODE Van der Waals effect keeps CNTs apart or together Data retention >300 years @ 300 C (more likely >1000 years) Stochastic array of many nanotubes per each cell
No Dielectric No Known Failure Mechanism 5 TOP METAL CNTs switch in a void No dielectric Wear-out has not been observed BOTTOM ELECTRODE Unlimited write endurance expected
Resistance Measurements, 0 and 1 6 20 15 Count 10 One representative 15 x 15 nm cell shown 5 0 ON OFF 100k 1M 10M 100M 1G Resistance (Ohm) Voltage Greater than 10X difference between 0 and 1 No calibration required across the wafer Smooth SET curve: MLC has been tested as well
Pulse Width Timing, Reset 0, Set 1 7 Cumulative sampling Consistent operation from 40 ns down to 5 ns read/write per cell And very repeatable
No Apparent Temperature Sensitivity 8 Similar set & reset curves at any temperature CNT operation and retention seen at 300 C, > 300 years Limited by underlying silicon circuit reliability
Capacity Scaling 9.. Add layers of CNTs Example: 4Gb/layer w/ 28nm logic Logic/Memory process Substrate Die stacking using standard TSV Multi-level cell as a function of pulse Process agnostic Can be built on top of memory or logic processes Note: Reference design detail; may vary by customer Process scaling is a function of #CNTs per bit well understood < 5nm
DDR Scalability 512 Gb 7 nm 8 layers CNT 10 ~100 mm 2 Design Ready 16 Gb 28 nm logic 4 layers CNT 8 Gb 28 nm logic 2 layers CNT Add layers New process Add layers 128 Gb 14 nm 8 layers CNT 64 Gb 14 nm 4 layers CNT New process DDR4 8-die stacks Note: Reference design detail; may vary by customer 256 Gb 7 nm 4 layers CNT Add layers DDR5 16-die stacks
is a Memory Class Storage 11 Hard Disk SSD NVMe Wasteland Flash Storage Class Memory 3D NOR Phase Change 3D Xpoint Resistive Magnetic Memory Class Storage > performance = endurance > capacity < price DDR Painfully slow Lotsa cheap bits Low endurance Moderate speed Moderate endurance Capacity range DDR
DDR4 Reference Internal Architecture 12 Bank Group 3 Bank Group 2 Bank Group 1 Bank Group 0 Bank & Row Address Latches Bank Decode Bank Bank Bank Bank BG[1:0] BA[1:0] A[16:0] Address Registers Column Address Latches Latching Sense Amp Latching Sense Amp Latching Sense Amp Latching Sense Amp C[2:0] CS_n WE_n CAS_n RAS_n CK_t CK_c CKE Chip ID Registers Command Decoder Clock Control Mapping CAM Built-In Self Test CA Parity, Data CRC Column Decode DLL SECDED ECC Engine Column Decode I/O Multiplexer: 64 + 8 ECC Write Data Latches Multi Purpose Registers Column Decode Read Data Latches Column Decode RESET_n Control Logic Strobe Generator I/O Drivers & Control ODT, Vref Training, ZQ Calibration Value Added Functions DQS DQ I ODT Standard DDR Functions CNT Specific Functions Note: Reference design detail; may vary by customer
Crosspoint Tiles 13 H1<0> H2<0> V1<0> V2<0> V1<1> V2<1> V1<2> V2<2> V1<3> V2<3> V1<4> V2<4> V1<5> V2<5> V1<6> V2<6> V1<7> V2<7> H1<1> H2<1> H1<2> H2<2> H1<3> H2<3> Select Decode Logic Tile = 64 x 256 x 4 Latching sense amplification Note: Reference design detail; may vary by customer
Timing Impact of On-the-fly ECC 14 Carbon Nanotube Arrays Timing Row cycle DDR4 S ns 47.00 DDR4 ns 46.25 DDR5 S ns 50.18 72 bits Access time 17.14 13.50 18.18 ECC Engine Row to column 15.00 23.00 18.18 64 bits Precharge 15.00 14.25 18.18 FIFO Write recovery 15.00 23.00 45.00 Activate to precharge 32.00 32.00 32.00 Data Strobes Refresh 350.00 0 350.00 Note: Reference design detail; may vary by customer
Bus Efficiency Comparison at Same Frequency 15 DDR4/DDR5 ~15% Base throughput Elimination of refresh Elimination of ACTIVATE restrictions Elimination of bank group restrictions Elimination of power states
128 GB LRDIMM or RDIMM 16 RCD SPD SPD taa trcd trp etc Host CPU Fully Deterministic DDR Memory interface Host reads SPD, configures memory interface per settings
For Comparison, Industry NVDIMM-N 17 Flash backup for External energy source & regulation Half Half Flash Flash Flash Power Supply NVC
For Comparison, Industry NVDIMM-P 18 Non-deterministic protocol needed because other NVMs have wear out and need leveling External energy NVDIMM-P Many need cache Controller allocates credits for all R/W Power Supply NVM NVM NVM NVM NVM NVC NVM NVM NVM NVM NVM All data buffered; all signals flow to centralized controller
Power Fail Comparison 19 Power fail Power restore NVDIMM-N, -P Complete burst in process Switch power to battery Save all pending operations Copy to NVM A minute or more Check save status Copy NVM to Run Complete burst in process Run Module
Software Increasingly Persistence Aware 20 Windows, Linux exploit persistent memory
Summary 21 Electrostatic effects set & reset each bit Resistance delta of 10X allows reliable sensing Dielectric-free cell shows no wear-out DDR4 includes a -compatible front end Defines a new category Memory Class Storage per die capacity scales far beyond Fully deterministic timing better than a On-the-fly ECC incorporated for server class reliability Module level products are plug and play compatible Industry is ready for persistent main memory
22 Thank you for your time Bill Gervasi bilge@nantero.com