Building the Software 2.0 Stack. Andrej Karpathy May 10, 2018

1 Building the Software 2.0 Stack Andrej Karpathy May 10, 2018

2 1M years ago

9 Engineering: approach by decomposition
1. Identify a problem
2. Break down a big problem into smaller problems
3. Design algorithms for each individual problem
4. Compose solutions into a system (get a "stack")
Examples: AWS stack, TCP/IP stack, Android software stack

10 We got surprisingly far...

11 What is the recognition stack? cat

12 Visual Recognition: 1980 ~ 1990 David Marr, Vision

14 Visual Recognition: ~
Feature Extraction -> vector describing various image statistics -> f (training) -> 1000 numbers, indicating class scores

15 Computer Vision 2011 Page 1

16 Computer Vision 2011 Page 2

17 Computer Vision code complexity :( Page 3

18 Feature Extraction -> vector describing various image statistics -> f (training) -> 1000 numbers, indicating class scores
f (training) -> 1000 numbers, indicating class scores

19 Neural Architecture Search with Reinforcement Learning, Zoph & Le
Large-Scale Evolution of Image Classifiers, Real et al.

20 Scale: Datasets & Compute
Datasets:
- Google/FB, images on the web (~10^9+ images)
- ImageNet (~10^6 images)
- Pascal VOC (~10^5 images)
- Caltech 101 (~10^4 images)
- Lena (10^0; single image)
Models:
- Hard Coded (edge detection etc., no learning)
- Image Features (SIFT etc., learning linear classifiers on top)
- ConvNets (learn the features, structure hard-coded)
- CodeGen (learn the weights and the structure)
Chart annotations: "In Computer Vision...", "Zone of not going to happen.", "Top performing models"

21 Software 1.0
Written in code (C++, ...)
Requires domain expertise
1. Decompose the problem
2. Design algorithms
3. Compose into a system
Measure performance

22 Software 2.0
"Fill in the blanks" programming
Requires much less domain expertise
1. Design a code skeleton
2. Measure performance (automate)
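A minimal sketch of the "fill in the blanks" idea, not code from the talk: the human writes a skeleton with unknown parameters and a performance measure, and an optimizer fills in the blanks. The skeleton `y = w * x + b`, the toy data, and the names `skeleton`, `loss`, and `fit` are all assumptions chosen for illustration; a real 2.0 program would use a neural network and a far larger dataset.

```python
# Hypothetical toy example: the blanks are the parameters w and b.

def skeleton(x, w, b):
    """Code skeleton written by the human; w and b are left blank."""
    return w * x + b

def loss(w, b, data):
    """Performance measure: mean squared error over the dataset."""
    return sum((skeleton(x, w, b) - y) ** 2 for x, y in data) / len(data)

def fit(data, steps=2000, lr=0.05):
    """Fill in the blanks with plain gradient descent on the loss."""
    w, b = 0.0, 0.0
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (skeleton(x, w, b) - y) * x for x, y in data) / n
        grad_b = sum(2 * (skeleton(x, w, b) - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# The desired behavior is specified only through examples (here: y = 3x - 1).
data = [(x, 3.0 * x - 1.0) for x in [0.0, 1.0, 2.0, 3.0, 4.0]]
w, b = fit(data)
```

Nobody typed the values of `w` and `b`; they were found by measuring performance on data, which is the step the slide suggests automating.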

23 Program space Software 1.0

24 Program space Software 1.0 Software 2.0

25 Program space Software 1.0 Software 2.0

28 One Model To Learn Them All: "single model is trained concurrently on ImageNet, multiple translation tasks, image captioning (COCO dataset), a speech recognition corpus, and an English parsing task"

29 (no need for datasets necessarily)

30 Other example members of the transition... Stochastic Program Optimization for x86_64 Binaries, PhD thesis of Eric Schkufza, 2015

31 Robotics: Google robot arm farm. Neural Net: Image to torques

32 *ASTERISK :) Fully Software 1.0

33 1.0 Software 1.0 is not going anywhere W

34 1.0 Software 1.0 is not going anywhere W deployment package W

35 The benefits of Software 2.0

36 Computationally homogeneous

37 Hardware-friendly

38 Constant running time and memory use vs.

39 Agile: "I'd like code with the same functionality but I'd like it to run faster, even if it means slightly worse results" vs.

40 Finetuning vs.

41 It works very well. DL (slide from Kaiming He's recent presentation)

42 Largest deployment of robots in the world (0.25M). Make them autonomous.

43 Inputs: 8 cameras, ultrasonics, radar, IMU -> 1.0 code, 2.0 code -> Outputs: steering & acceleration

44 Inputs: 8 cameras, ultrasonics, radar, IMU -> Outputs: steering & acceleration

45 Inputs: 8 cameras, ultrasonics, radar, IMU -> Outputs: steering & acceleration

46 Example: parked cars
[image labels: car, car, car, car]
Parked if: Tracked bounding box does not move more than 20 pixels over last 3 seconds AND is in a neighboring lane, AND...
(brittle rules on highly abstracted representation)

47 Example: parked cars
[image labels: car, car, car, car; "Car parked."]
Parked if: Tracked bounding box does not move more than 20 pixels over last 3 seconds AND is in a neighboring lane, AND...
(brittle rules on highly abstracted representation)
Parked if: Neural network says so, based on a lot of labeled data.
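A rough sketch of what the hand-written (1.0) rule could look like, to make the brittleness concrete. This is not code from the talk: the `Track` type, its fields, and the reading of "does not move more than 20 pixels" as the extent of box-center motion in x or y are assumptions, and the further conditions hidden behind "AND..." are left out because the slide does not state them.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Track:
    """Hypothetical tracked bounding box: center samples as (t_seconds, x_px, y_px)."""
    centers: List[Tuple[float, float, float]]
    in_neighboring_lane: bool

def is_parked_by_rule(track, max_shift_px=20.0, window_s=3.0):
    """1.0-style rule: little motion over the last window AND in a neighboring lane."""
    if not track.centers:
        return False
    latest_t = track.centers[-1][0]
    recent = [(x, y) for t, x, y in track.centers if latest_t - t <= window_s]
    xs = [x for x, _ in recent]
    ys = [y for _, y in recent]
    # Assumption: "move" means the extent of center motion along either axis.
    shift = max(max(xs) - min(xs), max(ys) - min(ys))
    return shift <= max_shift_px and track.in_neighboring_lane

still = Track([(0.0, 100, 50), (1.5, 102, 51), (3.0, 101, 50)], True)
jitter = Track([(0.0, 100, 50), (1.5, 121, 50), (3.0, 100, 50)], True)
other_lane = Track([(0.0, 100, 50), (1.5, 102, 51), (3.0, 101, 50)], False)
```

In this sketch a single noisy detection that shifts the box by 21 pixels flips the answer for `jitter`, even though the car has not moved. The 2.0 alternative on the slide replaces the thresholds with a network trained on labeled examples.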

48 Programming with the 2.0 stack

49 If optimization is doing most of the coding, what are the humans doing?
- Design and develop cool algorithms
- Analyze running times

50 If optimization is doing most of the coding, what are the humans doing?
- Design and develop cool algorithms
- Analyze running times
1. Label

51 If optimization is doing most of the coding, what are the humans doing?
- Design and develop cool algorithms
- Analyze running times
1. Label
2. Maintain surrounding dataset infrastructure
  - Flag labeler disagreements, keep stats on labelers, escalation features
  - Identify interesting data to label
  - Clean existing data
  - Visualize datasets
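One small piece of that dataset infrastructure, sketched under assumptions: flagging datapoints where labelers disagree and keeping a per-labeler disagreement rate. The record format `(datapoint_id, labeler_id, label)`, the function name, and the use of a simple majority vote as the reference answer are illustrative choices, not something the slides specify.

```python
from collections import Counter, defaultdict

def review_labels(records):
    """records: iterable of (datapoint_id, labeler_id, label) tuples (assumed format).

    Returns (flagged_ids, disagreement_rate_per_labeler).
    """
    votes_by_item = defaultdict(list)
    for item, labeler, label in records:
        votes_by_item[item].append((labeler, label))

    flagged = []
    dissent = Counter()  # times a labeler differed from the majority label
    totals = Counter()   # labels submitted per labeler
    for item, votes in votes_by_item.items():
        counts = Counter(label for _, label in votes)
        majority_label = counts.most_common(1)[0][0]
        if len(counts) > 1:
            flagged.append(item)  # labelers disagree: escalate for review
        for labeler, label in votes:
            totals[labeler] += 1
            if label != majority_label:
                dissent[labeler] += 1

    rates = {labeler: dissent[labeler] / totals[labeler] for labeler in totals}
    return flagged, rates

records = [
    ("img_a", "ann", "car"), ("img_a", "bob", "car"), ("img_a", "cy", "truck"),
    ("img_b", "ann", "car"), ("img_b", "bob", "car"), ("img_b", "cy", "car"),
]
flagged, rates = review_labels(records)
```

A majority vote is only a stand-in here; with an even number of labelers, ties would need an explicit escalation rule.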

52 Amount of lost sleep over... PhD Tesla

53 Lesson learned the hard way #1: Data labeling is highly non-trivial

54 Label lane lines

55 Label lane lines

56 Label lane lines

57 (Philosophical conundrums) How do you annotate lane lines when they do this?

58 Label lane lines

59 Label lane lines

60 (Philosophical conundrums) Is that one car, four cars, two cars?

61 (Philosophical conundrums)

62 (Philosophical conundrums)

63 Lesson learned the hard way #2: Chasing Label/Data Imbalances is non-trivial

64 car: 90% of all vehicles; trolley: 1e-3% of all vehicles

65 10% of all signs 1e-4% of all signs

66 Right blinker on

67 Orange traffic light

68 90%+ of data

69 1e-3% of data

70 1e-3% of data
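To illustrate the imbalance numbers above, here is one common baseline, inverse-frequency sampling weights, so that rare classes are drawn as often as common ones during training. This is an assumption about how one might respond to the imbalance, not a description of what the talk did; the slides stress finding and labeling more of the rare cases, which reweighting alone cannot replace.

```python
from collections import Counter

def balanced_sampling_weights(labels):
    """Return one sampling probability per example so each class is equally likely.

    Every class receives total probability 1 / n_classes, split evenly
    among its examples; the weights sum to 1.
    """
    counts = Counter(labels)
    n_classes = len(counts)
    return [1.0 / (n_classes * counts[label]) for label in labels]

# Toy dataset: 9 cars for every trolley (the real ratio on the slide is far more extreme).
labels = ["car"] * 9 + ["trolley"]
weights = balanced_sampling_weights(labels)
```

The weights can be passed to a sampler such as the standard library's `random.choices(examples, weights=weights, k=batch_size)`. With a class at 1e-3% of the data, the same handful of examples would be repeated constantly, which is part of why the slide calls the problem non-trivial.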

71 Lesson learned the hard way #3: Labeling is an iterative process

72 1. Collect labels
2. Train a model
3. Deploy the model
Example: Autowiper

73 1. Collect labels
2. Train a model
3. Deploy the model
Example: Autowiper

76 Lesson learned the hard way overall: The toolchain for the 2.0 stack does not yet exist. (and few people realize it's a thing)

77 1.0 IDEs

78 2.0 IDEs???

79 2.0 IDEs
- Show a full inventory/stats of the current dataset
- Create / edit annotation layers for any datapoint
- Flag, escalate & resolve discrepancies in multiple labels
- Flag & escalate datapoints that are likely to be mislabeled
- Display predictions on an arbitrary set of test datapoints
- Autosuggest datapoints that should be labeled
- ...
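Two of the features listed above can be sketched with nothing more than model predictions. The heuristics here are assumptions for illustration, not something the slides prescribe: "autosuggest" is approximated by picking the unlabeled datapoints with the highest predictive entropy, and "likely to be mislabeled" by a confident prediction that contradicts the stored label. The function names, the dict-based data layout, and the 0.9 threshold are hypothetical.

```python
import math

def entropy(probs):
    """Shannon entropy of a class-probability vector (natural log)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def suggest_for_labeling(predictions, k):
    """predictions: {datapoint_id: [class probabilities]}. Return the k most uncertain ids."""
    ranked = sorted(predictions, key=lambda i: entropy(predictions[i]), reverse=True)
    return ranked[:k]

def likely_mislabeled(predictions, labels, threshold=0.9):
    """labels: {datapoint_id: class index}. Flag confident predictions that contradict the label."""
    flagged = []
    for item, label in labels.items():
        probs = predictions[item]
        top = max(range(len(probs)), key=probs.__getitem__)
        if top != label and probs[top] >= threshold:
            flagged.append(item)
    return flagged

predictions = {"a": [0.5, 0.5], "b": [0.99, 0.01], "c": [0.7, 0.3]}
to_label = suggest_for_labeling(predictions, k=2)
suspects = likely_mislabeled(predictions, {"b": 1, "c": 1})
```

Both heuristics depend on the current model being reasonably calibrated, so flagged items are candidates for human escalation rather than automatic corrections.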

82 The sky's the limit

83 Thank you!
