Intelligent Hybrid Flash Management

Size: px

Start display at page:

Download "Intelligent Hybrid Flash Management"

Clementine Booth
5 years ago
Views:

1 Intelligent Hybrid Flash Management Jérôme Gaysse Senior Technology&Market Analyst Flash Memory Summit 2018 Santa Clara, CA 1

2 Research context Analysis of system & application Performance modeling Emerging technologies analysis New architecture definition Performance simulation Santa Clara, CA 2

3 Deep learning Inference and training Training value = weights value Training problem Take a long time: time to market impact Expansive hardware resources : TCO impact Santa Clara, CA 3

4 Training process For each image: Forward Backward 1) Inference calculation 2) Error calculation 3) Weights update Data set (100k to 1M images) Batch (128 images) Training system 80% training 20% verification All images = 1 epoch Training = multiple epochs Santa Clara, CA 4

5 Deep learning training System architecture SSD GPU HBM All flash array NIC CPU SSD GPU HBM Santa Clara, CA 5

6 Performance analysis Focus on ResNet50 neural network 50 layers 25M parameters 3.9GMACs Many benchmarks available Nvidia, HPE, Dell Santa Clara, CA 6

7 Performance analysis Throughput : 400 images per second/gpu FP32 resolution 25M parameters => 100MB model size Checkpointing (about every 400 images) =>100MB/s write Data set : 100kB => 40MB/s read No IO storage bottleneck Santa Clara, CA 7

8 DL improvements From FP32 to FP16 to INT8 Less computing requirements Less memory bandwidth requirements Pruning (less connexions between neurons) Less computing requirements Less memory bandwidth requirements Training throughput to increase Santa Clara, CA 8

9 Training performance increase With DL optimization, and new Deep Learning Processor developement From 400 FPS to 10,000FPS (estimation) x25 Santa Clara, CA 9

10 IO impact If throughtput x25 Read => 1GB/s (x25) Write => 1GB/s (x10) x25 accesses but less data to write Santa Clara, CA 10

11 Need new architecture All flash array 8 GB/s 8 GB/s Training system Huge data movement between training system and storage array => High power consumption => High silicon cost (AFA controller, NIC ) Santa Clara, CA 11

12 Computational storage concept Computational storage = computing capabilities in the SSD CPU SSD controller + NandFlash + Local computing Reduce data movement: Less power, higher performance Santa Clara, CA 12

13 Computational storage architecture Computational storage controller NV Cache controller NV Cache DNN parameters checkpointing NVMe Flash controller Flash 1 Flash 2 Data set DNN parameters Hyper parallel processor Santa Clara, CA 13

14 Theory of operation Namespaces Weights Data set Vendor specific commands Start training Santa Clara, CA 14

15 DNN weights storage, why? NV Cache could be enough Only few dozens of MB: small storage Another reason? Yes, for weight convergence analysis Flash Memory Summit 2018 Santa Clara, CA 15

16 Multi Flash requirements Dataset storage Capacity: 100G-1TB range Read only : 1GB/s Weight storage Capacity : 5TB Write only: 1GB/s Santa Clara, CA 16

17 Flash options Criteria Performance, cost, software simplicity Data set and weights on the same media Data set and weights on 2 media Data set and weights on flash and ssd Flash Flash Flash Flash M.2 Flash Memory Summit 2018 Santa Clara, CA 17

18 Embedded processing KeyValueStore: for data set manegement Preprocessing DL Santa Clara, CA 18

19 System benefit SSD All flash array NIC CPU GPU HBM Removing hardware cost & power CPU Deep Learning Processor Flash Computational storage Santa Clara, CA 19

20 Next research steps Simulation at system level Detailed storage access (latency) Computational storage applied to NVDIMM? Benefits of new interconnects? GenZ, OpenCapi, CCIX Santa Clara, CA 20

21 Want to know more? A Comparison of In-storage Processing Architectures and Technologies Monday Sept 24, 10:35 AM - 11:25 AM Santa Clara, CA 21

Using MRAM to Create Intelligent SSDs

Using MRAM to Create Intelligent SSDs Jérôme Gaysse Senior Technology&Market Analyst jerome.gaysse@silinnov-consulting.com Santa Clara, CA 1 Study context Analysis of system & application Performance modeling