Benchmarking Apache Flink and Apache Spark DataFlow Systems on Large-Scale Distributed Machine Learning Algorithms


Benchmarking Apache Flink and Apache Spark DataFlow Systems on Large-Scale Distributed Machine Learning Algorithms. Candidate: Andrea Spina. Advisor: Prof. Sonia Bergamaschi. Co-Advisors: Dr. Tilmann Rabl, Christoph Boden. Dipartimento di Ingegneria Enzo Ferrari - Master's Degree in Computer Engineering - Modena

Where, When, How and Why: Berlin, DE; 5-month traineeship, May-October 2016; Database Systems and Information Management Group, Technische Universität Berlin; team project within the Systems Performance Research Unit.

Agenda: Background; Experiments Definition; Benchmarking and Results Analysis; Insights from Results.

Benchmarking Apache Flink and Apache Spark DataFlow Systems on Large-Scale Distributed Machine Learning Algorithms Background

Why distributed machine learning? Multiple data sources and iterative algorithms push past what MapReduce handles well.

Because it represents innovation. SOLUTION: multiple sources + iterative algorithms -> massive dataflow engines.

One of the goals - fairness: release the code as open source, keep jobs reproducible, make the benchmark exhaustive, and model the systems as similarly as possible.

Another goal - include more and more systems.

My goal - OK, may I start with a couple of those?

Apache Flink vs. Apache Spark. THE GOAL - benchmark performance and scalability.

System similarities - both are stacks.

System similarities - both do batch and streaming. APIs: Java, Scala, and Python on both; Spark additionally offers R.

Apache Spark vs. Apache Flink - differences: Spark grows from batch to streaming, Flink from streaming to batch. They also differ in iterations, memory management, and user policies. We benchmark batch.

Peel Framework - the benchmarking software: it submits configurations by dependency injection and packages everything together in Peel bundles.

Peel execution flow - the suite:run command: SETUP SUITE.

Peel execution flow - turn on systems: SETUP SUITE -> SETUP EXP -> SETUP RUN (experiment scope).

Peel execution flow - collect logs and run again: SETUP RUN -> EXECUTE RUN -> TEARDOWN RUN (collect logs) -> to next RUN.

Peel execution flow - turn off systems: TEARDOWN RUN -> TEARDOWN EXP -> to next EXP -> TEARDOWN SUITE.

shee - a fast and furious Peel data visualization tool. https://github.com/spi-x-i/shee - built on top of Python, Pandas, and matplotlib APIs; node-level and cluster-level views; web UI.

shee on GitHub: 48 commits, 1 star, 1 fork, 3 contributors.

Benchmarking Apache Flink and Apache Spark DataFlow Systems on Large-Scale Distributed Machine Learning Algorithms Defining Experiments https://gitlab.tubit.tu-berlin.de/andrea-spina/mlbenchmark

The fairness constraint - Apache Spark 1.6.2 vs. Apache Flink 1.0.3. We want the same (as much as possible): data structures, solver pipelines, operators, configuration parameters, and environment. Guaranteed by Peel.

Experiments overview - four applications. We want to cover many applications.
APPROACH: Regression; Supervised Learning; Unsupervised Learning; Recommendation System.
ALGORITHM (chosen by a tradeoff between complexity and fairness): Multiple Linear Regression; Support Vector Machine*; KMeans; Alternating Least Squares.
REFERENCE SYSTEMS: Apache SystemML; Apache Spark; Apache Spark; Apache Flink.
DATA GENERATION: data written on demand by the Peel Framework.

Building the experiment pipeline - KMeans example. PREMISE: we always evaluate training-phase performance.

Building the KMeans pipeline - studying. KMeans clustering finds new classes from unlabeled data by grouping. ASSIGNMENT STEP: re-partition the datapoints according to the nearest centroid. UPDATE STEP: retrieve the new centroids as the mean of the assigned datapoints' locations.
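In symbols (the standard KMeans steps, added here for reference; $x_i$ are datapoints, $\mu_j$ the centroids, and $C_j$ the set of points assigned to centroid $j$):

ASSIGNMENT: $c_i \leftarrow \arg\min_{j} \lVert x_i - \mu_j \rVert^2$

UPDATE: $\mu_j \leftarrow \frac{1}{\lvert C_j \rvert} \sum_{x_i \in C_j} x_i$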

Building the KMeans pipeline - studying. 1. Explore the systems' machine learning libraries. 2. Do research! E.g., a smarter choice of the k initial centroids: random vs. KMeans++. What do we want to compare? What keeps the systems under stress?
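For reference, KMeans++ (the slide names it but does not spell it out) biases each next initial center toward points far from the centers already chosen: with $D(x)$ the distance from $x$ to its nearest already-chosen center, the next center is sampled with probability

$P(x) = \dfrac{D(x)^2}{\sum_{x'} D(x')^2}$

which usually yields far better starting centroids than a uniform random choice.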

Building the KMeans pipeline - data structures. Dataset: point vectors p0 = (x0, x1, ..., xn-1), p1 = (x0, x1, ..., xn-1), ...; initial centers as (id, vector) pairs: c0 = (id0, x0, ..., xn-1), c1 = (id1, x0, ..., xn-1). We need to model data (we employed Flink Vectors and Spark Vectors) and to operate on data (we employed Breeze Vectors and Scala Arrays).
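A minimal Scala sketch of this data model, assuming Breeze dense vectors aliased as BDVector (as in the workload code in the bonus slides); the concrete values are illustrative only:

import breeze.linalg.{DenseVector => BDVector}

// Dataset: each point p is a feature vector (x0, x1, ..., xn-1)
val p0: BDVector[Double] = BDVector(0.5, 1.2, -3.1)
val p1: BDVector[Double] = BDVector(2.0, -0.7, 4.4)

// Initial centers: (id, vector) pairs, so assignments can refer to a centroid id
val c0: (Int, BDVector[Double]) = (0, BDVector(0.0, 0.0, 0.0))
val c1: (Int, BDVector[Double]) = (1, BDVector(1.0, 1.0, 1.0))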

Building the KMeans pipeline - KMeans iteration. Pipeline: INIT CENTERS + DATA -> INPUT -> CACHING (RDD/DataSet) -> iteration. SPARK: assignment step with a map function and reduceByKey; update step with a map function and an aggregate-by-average; broadcasting function for the model. FLINK: assignment step with a RichMapFunction and groupBy + reduce; update step with a map function and an aggregate-by-average; broadcasting function for the model.

Building the KMeans pipeline - materializing. Pipeline: INIT CENTERS + DATA -> INPUT -> CACHING (RDD/DataSet) -> KMEANS ITERATIONS -> MODEL (the final centers).

Building the KMeans pipeline - validation. Pipeline: ... -> KMEANS ITERATIONS -> OUTPUT -> MODEL. Model evaluation: fairness first (convergence of the evaluation metrics across systems), then a good-enough model.
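One metric commonly used for this kind of clustering validation (an assumption here; the slide does not name its metrics) is the within-cluster sum of squares, which should converge to comparable values on both systems for the benchmark to be fair:

$\mathrm{WCSS} = \sum_{j=1}^{k} \sum_{x \in C_j} \lVert x - \mu_j \rVert^2$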

Benchmarking Apache Flink and Apache Spark DataFlow Systems on Large-Scale Distributed Machine Learning Algorithms Benchmarking and Results Analysis

Some general insights.
TOTAL RUNTIME: 200 h (96 h + 104 h across the two clusters); tuning time: Flink 22%, Spark 30%.
CLUSTERS: 2. WALLY: 30 nodes, 8 CPUs/node, 16 GB RAM/node, 3x1 TB storage/node, 1 Gbit Ethernet. CLOUD-11: 25 nodes, 16 CPUs/node, 32 GB RAM/node, 2x1 TB storage/node, 1 Gbit Ethernet.
DATASETS: 28, average size 275 GB.
EVALUATIONS: strong scale, weak scale, data scale.

Spark versus Flink summary - 34 runtime wins.
Multiple Linear Regression: Spark v Flink 8-1; Spark outperforms Flink by 63%; Flink 74% faster on critical resources; FlinkML provides better runtimes.
KMeans: Spark v Flink 10-7 (8).
Support Vector Machine: Spark v Flink 16-0; Spark outperforms Flink by 71%; Flink likes MORE data; good scalability behavior.
Alternating Least Squares (Recommendation System): NOT COMPARABLE; similar performance; Flink definitely likes MORE data; Flink 11% faster on critical resources.

KMeans strong scale and data scale - 12 GB RAM per node, 8-core CPU per node; sparsity 0%, model size 100.

Benchmarking Apache Flink and Apache Spark DataFlow Systems on Large-Scale Distributed Machine Learning Algorithms Insights from Executions

How should a large-scale processing engine work? Monitored metrics: CPU (user), disk (read/write), network (send/recv). Workload: Flink KMeans, 30 centroids, 5 iterations, 1 TB of data, 30 nodes, 12 GB RAM per node.

Like this KMeans experiment! Step A: building the execution pipeline - reading from the source and mapping points to Breeze vectors.

Step B - the master sends partitions across the cluster: repartition the data, cache it and spill to disk, and execute the first iteration.

Step C - regular iteration frequency: the next iterations (2 to 5) read the spilled data, broadcast the model, and produce the sink (write to disk).

How Spark outperforms Flink - #1 Repartitioning. SVM: 6 nodes, 212 GB dataset, 5 iterations, 12 GB RAM per node.

#1 Repartitioning - distributed-to-distributed.

#1 Repartitioning - what Spark does: a narrow transformation. GOAL - no data movement.

#1 Repartitioning - what Flink does: same goal of a narrow transformation with no data movement, but it ends up shuffling data between nodes.
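To make the narrow-vs-shuffle distinction concrete, here is an illustrative Spark snippet (not code from the thesis): coalesce can shrink the number of partitions as a narrow transformation, with no cross-node data movement, while repartition always triggers a full shuffle.

import org.apache.spark.{SparkConf, SparkContext}

val sc = new SparkContext(new SparkConf().setAppName("repartition-demo").setMaster("local[4]"))
val rdd = sc.parallelize(1 to 1000000, numSlices = 24)

// Narrow: merges co-located partitions, no shuffle
val fewer = rdd.coalesce(6)

// Wide: redistributes every record across the network
val shuffled = rdd.repartition(6)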

#1 Repartitioning - Flink network overhead.

Additional relevant findings.
#2 Caching - Flink issue FLINK-1730 (https://issues.apache.org/jira/browse/FLINK-1730). Spark's user-defined caching returns faster intra-iteration timings; Flink manages caching internally (Bulk Iterations) and is slower when the data is not big.
#3 Broadcasting - Flink Improvement Proposal FLIP-5 (https://cwiki.apache.org/confluence/display/FLINK/FLIP-5%3A+Only+send+data+to+each+taskmanager+once+for+broadcasts). Flink's broadcast brings communication overhead; anyway, it was not critical for this benchmark.
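A minimal sketch of the Spark side of #2, assuming a SparkContext sc is already available; the user explicitly chooses what to cache and at which storage level, and the first action materializes the cache for reuse across iterations:

import org.apache.spark.storage.StorageLevel

// User-defined caching: keep the training points in memory,
// spilling partitions to disk when they do not fit.
val points = sc.parallelize(1 to 1000).map(_.toDouble)
  .persist(StorageLevel.MEMORY_AND_DISK)

points.count() // first action fills the cache; later iterations reuse it

Flink exposes no equivalent user-facing knob here: Bulk Iterations manage the loop state internally, which is the contrast the slide draws.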

Benchmarking Apache Flink and Apache Spark DataFlow Systems on Large-Scale Distributed Machine Learning Algorithms Conclusions

Main considerations. Currently Spark is the right choice for batch purposes, and now there is Spark 2.0. Flink was born to stream and is growing along with streaming; one needs to find a tradeoff. Flink puts robustness and availability first, and it masters joins, hashing, and grouping. Spark puts performance and efficiency first.

Thank You

Media
https://blog.websummit.net/berlin-the-startup-city-guide/ - background image - p. 2
https://whatsthebigdata.com/2013/03/18/processing-big-data-the-google-way/ - background image - p. 4
https://www.mapr.com/sites/default/files/blogimages/spark-core-stack-db.jpg - Spark stack - p. 11
https://flink.apache.org/img/flink-stack-frontpage.png - Flink stack - p. 11
http://www.hostingtalk.it/wp-content/uploads/2016/04/machine_learning.png - background image - p. 28
http://static.wixstatic.com/media/53defd_17c4b53bdda34dd89eed13867b9cc1aa~mv2.jpg - background image - p. 51
http://www.trustsecurity.co.uk/admin/resources/monitoring-w680h300.jpg - background image - p. 61

Bonus Slides

Peel Framework Deploy Flow

Peel execution flow - it enables context fairness: 1 SETUP SUITE -> 2 SETUP EXP -> 3 SETUP RUN -> 4 EXECUTE RUN (experiment scope) -> TEARDOWN RUN (collect logs) -> to next RUN -> 5 TEARDOWN EXP -> to next EXP -> TEARDOWN SUITE.

The KMeans Theory

KMeans Workload Code

Building the KMeans pipeline - data structures (INIT CENTERS + DATA; * the project is developed in Scala).

KMeans Iteration #1 (Flink)

@ForwardedFields(Array("*->_2"))
final class CommonSelectNearestCenter extends RichMapFunction[BDVector[Double], (Int, BDVector[Double], Long)] {
  private var centroids: Traversable[(Int, BDVector[Double])] = null

  /** Reads centroids and indexing values from the broadcast set. */
  override def open(parameters: Configuration): Unit = {
    centroids = getRuntimeContext.getBroadcastVariable[(Int, BDVector[Double])]("centroids").asScala
  }

  override def map(point: BDVector[Double]): (Int, BDVector[Double], Long) = {
    var minDistance: Double = Double.MaxValue
    var closestCentroidId: Int = -1
    for ((idx, centroid) <- centroids) {
      val distance = squaredDistance(point, centroid)
      if (distance < minDistance) {
        minDistance = distance
        closestCentroidId = idx
      }
    }
    (closestCentroidId, point, 1L)
  }
}

val finalCentroids: DataSet[(Int, BDVector[Double])] = centroids.iterate(iterations) { currentCentroids =>
  val newCentroids = points
    .map(new CommonSelectNearestCenter)
    .withBroadcastSet(currentCentroids, "centroids")
  /** ... **/

KMeans Iteration #1 (Spark)

while (iterations < maxIterations) {
  val bcCentroids = data.context.broadcast(currentCentroids)
  val newCentroids: RDD[(Int, (BDVector[Double], Long))] = data.map(point => {
    var minDistance: Double = Double.MaxValue
    var closestCentroidId: Int = -1
    val centers = bcCentroids.value
    centers.foreach(c => { // c = (idx, centroid)
      val distance = squaredDistance(point, c._2)
      if (distance < minDistance) {
        minDistance = distance
        closestCentroidId = c._1
      }
    })
    (closestCentroidId, (point, 1L))
  })
  /** ... **/

KMeans Iteration #2 (Flink)

val finalCentroids: DataSet[(Int, BDVector[Double])] = centroids.iterate(iterations) { currentCentroids =>
  val newCentroids = points
    .map(new CommonSelectNearestCenter).withBroadcastSet(currentCentroids, "centroids")
    .groupBy(0)
    .reduce((p1, p2) => (p1._1, p1._2 + p2._2, p1._3 + p2._3))
    .withForwardedFields("_1")
  /** ... **/

KMeans Iteration #2 (Spark)

    (closestCentroidId, (point, 1L))
  }).reduceByKey(mergeContribs)

type WeightedPoint = (BDVector[Double], Long)
def mergeContribs(x: WeightedPoint, y: WeightedPoint): WeightedPoint = {
  (x._1 + y._1, x._2 + y._2)
}

KMeans Iteration #2, averaging step (Flink)

val avgNewCentroids = newCentroids.map(x => {
  val avgCenter = x._2 / x._3.toDouble
  (x._1, avgCenter)
}).withForwardedFields("_1")
avgNewCentroids

KMeans Iteration #2, averaging step (Spark)

currentCentroids = newCentroids.map(x => {
  val (center, count) = x._2
  val avgCenter = center / count.toDouble
  (x._1, avgCenter)
}).collect()
iterations += 1

When Experiment Definition Goes Wrong

SVM and gradient descent: what we wanted to do. ORIGINAL IDEA: gradient descent + mini-batching. 1. Sampling was not comparable across systems -> a custom, common sampler. 2. mapPartitions over mini-batches: shuffle indices and take a subset, compute the local GD gradient, apply the global GD update.
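A minimal sketch, under stated assumptions, of what "mapPartitions over mini-batches" could look like in Spark for one gradient-descent step; w, batchSize, and localGradient are hypothetical names, and the hinge-loss subgradient is just one plausible choice for SVM:

import breeze.linalg.{DenseVector => BDVector}
import org.apache.spark.rdd.RDD
import scala.util.Random

def miniBatchGradient(data: RDD[(BDVector[Double], Double)],
                      w: BDVector[Double],
                      batchSize: Int): BDVector[Double] = {
  // Hypothetical per-example gradient (hinge-loss subgradient)
  def localGradient(w: BDVector[Double], p: (BDVector[Double], Double)): BDVector[Double] = {
    val (x, y) = p
    if (y * (w dot x) < 1.0) x * (-y) else BDVector.zeros[Double](w.length)
  }
  data.mapPartitions { iter =>
    // Shuffle indices and take a subset: the per-partition mini-batch.
    // Note: toSeq materializes the whole partition in memory.
    val batch = Random.shuffle(iter.toSeq).take(batchSize)
    // Local gradient contribution of this partition
    batch.map(p => localGradient(w, p)).reduceOption(_ + _).iterator
  }.reduce(_ + _) // global aggregation, feeding the global update of w
}

Materializing each partition for shuffling is exactly the kind of memory pressure that, per the next slide, made this approach fail in practice.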

SVM and gradient descent: what we did. ISSUE: Spark was not able to run mapPartitions (OutOfMemory exception) -> we fell back to batch gradient descent.

The Supervised Learning Framework

Other Results

Multiple Linear Regression strong scale - CLOUD-11 (25 nodes), 28 GB RAM per node, 16-core CPU per node. DATASET INFO: 10^7 datapoints, model size 1000, data size 80 GB, sparsity 30%. ALGORITHM INFO: 100 iterations.

Support Vector Machine strong scale and weak scale - 12 GB RAM per node, 8-core CPU per node; sparsity 0%, model size 1000.

Future Developments

Future improvements: complete the not-comparable benchmarks; redefine the ALS benchmark; add the systems not yet included; improve shee and integrate it into the Peel framework.