Sirius: An Open End-to-End Voice and Vision Personal Assistant and Its Implications for Future Warehouse Scale Computers

1 Sirius: An Open End-to-End Voice and Vision Personal Assistant and Its Implications for Future Warehouse Scale Computers
Johann Hauswald, Michael A. Laurenzano, Yunqi Zhang, Cheng Li, Austin Rovinski, Arjun Khurana, Ron Dreslinski, Trevor Mudge, Vinicius Petrucci, Lingjia Tang, Jason Mars
University of Michigan, Ann Arbor, MI

2 Intelligent Personal Assistants (IPAs)

3 Rise of the Wearables 40% $80bn

4-7 Scaling Current Datacenters
[Chart: compute resources required vs. ratio of IPA to web search queries, 0% to 100%]
10% IPA: 16x Machines
50% IPA: 80x Machines
100% IPA: 160x Machines
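
The labeled points on this chart (16x machines at 10% IPA queries, 160x at 100%) are consistent with a straight line through the origin. The sketch below reproduces that arithmetic. The linear model and the function name `machine_scale_factor` are assumptions made for illustration; the slides only show the chart, not a formula.

```python
# Assumption: the machine count grows linearly with the ratio of IPA to
# web search queries, calibrated to the slide's "100% IPA: 160x Machines".
MACHINES_AT_FULL_IPA = 160.0  # data point taken from the slide


def machine_scale_factor(ipa_to_search_ratio: float) -> float:
    """Return the scale-up factor (in multiples of today's machines)
    for a given IPA-to-web-search query ratio between 0.0 and 1.0."""
    if not 0.0 <= ipa_to_search_ratio <= 1.0:
        raise ValueError("ratio must be between 0.0 and 1.0")
    return MACHINES_AT_FULL_IPA * ipa_to_search_ratio


if __name__ == "__main__":
    for ratio in (0.1, 1.0):
        print(f"{ratio:.0%} IPA queries: {machine_scale_factor(ratio):.0f}x machines")
```

Under this assumption, every additional 10% of IPA queries costs another 16x the current machine count, which is the motivation for the accelerator study that follows.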

8 The Challenge
Redesign the datacenter for intelligent personal assistants
No Open Source IPA
Investigate Future Datacenter Designs

9 Open Source Intelligent Personal Assistant
Benchmark Suite
Ported Suite Across Accelerator Platforms
Investigate Future Datacenter Designs For IPAs

10 Sirius

11-22 Sirius architecture (animated diagram)
Mobile: Users send Voice and Image Data to the server; the returned Answer is shown on the device (Display Answer).
Server: Automatic Speech-Recognition -> Query Classifier -> Execute, or Question-Answering (backed by the Search Database) for a Question; Image Matching (backed by the Image Database) for an Image.
Example queries in the walkthrough:
  "Set my alarm for 6am"
  "What is the capital of Turkey?" -> "Ankara"
  "How tall is the Eiffel Tower?" -> "300 meters"
Sirius: full end-to-end with inputs, pre-trained models, and databases
Sirius-suite: 7 kernels with inputs to study each service individually
sirius.clarity-lab.org

23 How does Sirius work?
Query Taxonomy: Voice Command (VC), Voice Query (VQ), Voice-Image Query (VIQ)
IPA Services: Automatic-Speech Recognition (ASR), Question Answering (QA), Image Matching (IMM)
Open Source Tools: CMU Sphinx
Tasks: Signal Processing, Natural Language Processing, Image Processing
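
One way to read the query taxonomy is as a lookup from query type to the chain of IPA services it exercises. The sketch below encodes that reading. The mapping (VC uses ASR only, VQ adds QA, VIQ adds IMM) is an assumption inferred from the diagram's labels, and the names `QueryType` and `services_for` are hypothetical, not part of Sirius.

```python
from enum import Enum


class QueryType(Enum):
    VC = "Voice Command"
    VQ = "Voice Query"
    VIQ = "Voice-Image Query"


# Assumed mapping from query type to the services on its path.
# The slide names the three query types and the three services;
# the exact pairing here is an interpretation of the diagram.
SERVICES_BY_QUERY = {
    QueryType.VC: ("ASR",),
    QueryType.VQ: ("ASR", "QA"),
    QueryType.VIQ: ("ASR", "QA", "IMM"),
}


def services_for(query_type: QueryType) -> tuple:
    """Return the ordered service names a query of this type passes through."""
    return SERVICES_BY_QUERY[query_type]


if __name__ == "__main__":
    for qt in QueryType:
        print(f"{qt.value} ({qt.name}): {' -> '.join(services_for(qt))}")
```

Keeping the pairing in a table rather than in branching code makes it easy to adjust if the actual pipeline routes a query type differently.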

24-32 Sirius-suite
IPA Services and their kernels:
  Automatic-Speech Recognition (ASR): Gaussian Mixture Model, GMM (85%); Deep Neural Network, DNN (78%)
  Question Answering (QA): Stemmer (46%); Regex (22%); Conditional Random Fields, CRF (17%)
  Image Matching (IMM): Feature Extraction, FE (41%); Feature Description, FD (56%)
7 kernels: 92% total execution of Sirius
Suite entirely written in C/C++/CUDA
Release includes inputs and models
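
The kernel percentages bound how much a whole service can gain from acceleration. The sketch below applies Amdahl's law to the QA numbers from the slide, assuming each percentage is that kernel's share of its service's execution time (the slide does not spell this out). The 10x kernel speedups are hypothetical placeholders, not measured results.

```python
def service_speedup(kernel_fractions: dict, kernel_speedups: dict) -> float:
    """Amdahl's law with several accelerated kernels.

    kernel_fractions: kernel name -> fraction of service time (0.0-1.0)
    kernel_speedups:  kernel name -> speedup factor applied to that kernel
    """
    accelerated = sum(kernel_fractions.values())
    if accelerated > 1.0:
        raise ValueError("kernel fractions must not exceed 100% of the service")
    unaccelerated = 1.0 - accelerated  # time that stays as it is
    new_time = unaccelerated + sum(
        fraction / kernel_speedups[name]
        for name, fraction in kernel_fractions.items()
    )
    return 1.0 / new_time


def max_service_speedup(kernel_fractions: dict) -> float:
    """Upper bound when every listed kernel becomes infinitely fast."""
    return 1.0 / (1.0 - sum(kernel_fractions.values()))


if __name__ == "__main__":
    # Fractions from the slide (assumed to be shares of QA execution time).
    qa = {"Stemmer": 0.46, "Regex": 0.22, "CRF": 0.17}
    # Hypothetical speedups, chosen only to illustrate the calculation.
    hypothetical = {"Stemmer": 10.0, "Regex": 10.0, "CRF": 10.0}
    print(f"QA speedup with 10x kernels: {service_speedup(qa, hypothetical):.2f}x")
    print(f"QA upper bound: {max_service_speedup(qa):.2f}x")
```

Because the three QA kernels cover 85% of the service under this reading, the remaining 15% caps the end-to-end gain no matter how fast the kernels become.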

33 Future Datacenter Design

34 How must current datacenters be upgraded to meet demand?
What is the efficiency of the upgraded datacenter?

35 Upgrading Datacenters with COTS Systems
Platform       Model                Clock     Threads
Multicore CPU  Intel Xeon E V       GHz       8
GPU            NVIDIA GTX           GHz
Intel Phi      Phi 5110P            1.05 GHz  240
FPGA           Xilinx Virtex-6 ML   MHz       N/A

36 Upgrading Datacenters with COTS Systems
Platform       Advantage         Disadvantage
Multicore CPU  Minor SW changes  Limited speedup
GPU            Many threads      Programmability
Intel Phi      Manycore          Limited compiler support
FPGA           Flexible          New implementation

37-38 Acceleration Overview
[Table: per-kernel results by platform. Columns: Platform, GMM, DNN, Stemmer, Regex, CRF, FE, FD. Rows: CMP, GPU, Intel Phi, FPGA]
Surviving entries: GPU * 3.8*; FPGA * * 7.5* 34.6* 75.5*
Custom Porting: % of the Implementations

39-43 Acceleration Results
[Chart: speedup for Speech Recognition, Question Answering, and Image Matching]
Callouts, in order of appearance: ~6x, ~5x, ~52x, 120x, ~99x, 169x

44-48 Service Latency Improvement
[Chart: latency (s) by platform]
Callouts: 2.8s; using 8 threads
Average Latency Reduction: FPGA: 16x, GPU: 10x

49-50 Performance improvements increase throughput
Reduce the number of servers
What is the Total Cost of Ownership of an accelerator-upgraded datacenter?

51 TCO Model Parameters [1]
Parameter                   Value
Server Price                $2,102
Server Power                164 W
PUE                         1.1
DC Depreciation             12 years
Server Depreciation         3 years
Average Server Utilization  45%
Electricity Cost            $0.067/kWh
Datacenter Price            $10/W
Datacenter Opex             $0.04/W
Server Opex                 5% of Capex/year
[1] Barroso, Luiz André, et al. "The datacenter as a computer: An introduction to the design of warehouse-scale machines."
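
To make the parameters concrete, here is a rough sketch of a per-server monthly TCO calculation using the values from this slide. The cost formula is an assumption, not the model from [1] or from the paper: it uses straight-line depreciation with no interest, treats Datacenter Opex as per watt per month, scales energy use linearly with average utilization, and takes a month as 730 hours. The function names and the accelerator price, power, and throughput figures in the example are hypothetical.

```python
HOURS_PER_MONTH = 730.0  # assumption: 8760 hours per year / 12


def monthly_tco(
    server_price=2102.0,        # $ (slide)
    server_power_w=164.0,       # W (slide)
    pue=1.1,                    # (slide)
    dc_years=12,                # DC depreciation (slide)
    server_years=3,             # server depreciation (slide)
    utilization=0.45,           # average server utilization (slide)
    electricity_per_kwh=0.067,  # $/kWh (slide)
    dc_price_per_w=10.0,        # $/W (slide)
    dc_opex_per_w=0.04,         # $/W, assumed to be per month
    server_opex_rate=0.05,      # fraction of server capex per year (slide)
):
    """Return a per-server monthly cost breakdown in dollars (simplified model)."""
    costs = {
        "server_capex": server_price / (server_years * 12),
        "server_opex": server_opex_rate * server_price / 12,
        "dc_capex": dc_price_per_w * server_power_w / (dc_years * 12),
        "dc_opex": dc_opex_per_w * server_power_w,
    }
    # Assumption: energy drawn is proportional to average utilization.
    energy_kwh = server_power_w / 1000.0 * utilization * pue * HOURS_PER_MONTH
    costs["electricity"] = energy_kwh * electricity_per_kwh
    costs["total"] = sum(costs.values())
    return costs


def tco_improvement(baseline_total, upgraded_total, throughput_gain):
    """Ratio of cost per query before vs. after the upgrade."""
    return throughput_gain * baseline_total / upgraded_total


if __name__ == "__main__":
    baseline = monthly_tco()
    # Hypothetical accelerator: +$400 purchase price, +100 W, 10x throughput.
    upgraded = monthly_tco(server_price=2102.0 + 400.0, server_power_w=164.0 + 100.0)
    print(f"baseline monthly TCO per server: ${baseline['total']:.2f}")
    print(f"upgraded monthly TCO per server: ${upgraded['total']:.2f}")
    gain = tco_improvement(baseline["total"], upgraded["total"], 10.0)
    print(f"TCO improvement per query: {gain:.2f}x")
```

The point of the breakdown is that an accelerator raises both capex and power, so a TCO gain per query only materializes when the throughput improvement outpaces the added cost.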

52-53 TCO Query Level Results
[Chart: TCO improvement at the query level]
Average TCO improvement: GPU: 2.6x, FPGA: 1.4x

54 Other topics included in the paper:
Real System Analysis
Question Variability Analysis
Accelerator Porting Methodology
FPGA Implementation
Accelerator Details
Performance per Watt
Throughput Improvement at Various Load Levels
Homogeneous/Heterogeneous Datacenter Design

55 Sirius: full application
Sirius-suite: 7 kernels to study each service
sirius.clarity-lab.org

56 Thank you
