Performance Modeling of IoT Applications

Performance Modeling of IoT Applications
Dr. Subhasri Duttagupta, TCS
www.cmgindia.org

Contents
- Introduction to IoT systems
- Performance modeling of an IoT platform
- Performance modeling of a sample IoT application
- Performance modeling of a real-life WSN system
- Summary

Motivating Examples
- Scalability analysis
- End-to-end delay analysis (WNA Lab, Amrita University)

Elements of a Typical IoT Platform
[Architecture diagram: things with embedded sensors connect through gateway devices (OPC-UA, Modbus, Continua) to cloud services, and onward to apps, clients and portals on mobile devices. Platform blocks include device agents (LWM2M), device management, message routing & event processing, sensor data management, analytics, and app-facing RESTful APIs over http(s), tcp, udp and mqtt.]
A high-performance, scalable platform for the Internet of Things.

Questions that Performance Modelling Can Answer
- When does any subsystem become a bottleneck, as the sensor data rate increases or as the number of queries injected by users increases?
- How many VMs does a particular subsystem need to handle a certain load?
- Will the SLA be met for a certain growth in the number of users?
- What kind of performance modelling is useful in a specific situation?

Challenges in Performance Modelling of IoT
- Diverse technology
- Huge number of smart devices of various types
- Lack of a suitable platform and tools for testing the end-to-end system
- Difficulty in predicting the exact workload mix
- Frequent addition of new services and devices
- Changes in the deployment platform

Modeling Background
- Open systems are characterized by the rate of arrival; closed systems are characterized by the number of users and the think time.
- Input to the model: the service demand, i.e. the amount of CPU/disk time spent serving one unit of output. By the Utilization Law, service demand = utilization of the resource / throughput. A small sketch follows.
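
As a quick illustration of the Utilization Law, this minimal Python sketch derives a service demand from a utilization and throughput sample; the numbers are illustrative, not measurements from the talk:

```python
# Service demand via the Utilization Law: D = U / X.
cpu_util = 0.65        # measured CPU utilization (65%) during a steady interval
throughput = 80.0      # requests completed per second in the same interval
service_demand = cpu_util / throughput
print(f"CPU service demand: {service_demand * 1000:.1f} ms per request")  # ~8.1 ms
```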

What Questions Modelling Helps Answer
- Which component becomes the bottleneck when the system is handling the peak data rate from a number of sensors?
- How many VMs are required at each layer, for a certain number of sensors with a response-time SLA, or for a certain number of API clients with a response-time SLA?
- Can the system handle a certain growth in the users accessing the IoT services?
- Can the platform support different types of APIs (random-access query, range query, sequential-scan query) simultaneously without affecting performance?

Steps in a Modelling Exercise
1. Understand the architecture; identify the components that are significant.
2. Analyse the commonly used workloads and their parameters.
3. Run performance tests, or analyse available data, to obtain service demands for each type of workload.
4. Analyse the workload to find any variation in service demands under certain conditions.
5. Decide the flow of requests within the model; attach probabilities to the various alternate flows.

Architecture Diagram of a Subsystem
[Diagram: REST clients issue Sensor Observation Services API requests to Tomcat NIO servers running Spring MVC and SOS, which access the data tier via Phoenix/JDBC (with a Derby variant); an ID Generation Service pulls and queues IDs; observations and audit trails flow through a Hazelcast distributed cache and message exchange to an HBase cluster, with the Phoenix coprocessor on the region servers.]

Workloads for Sensor Observation Services (SOS)
Different APIs:
- GetObs: latest, by sensor, by time range
- PostObs
- Get/Post Sensor
- GetFeature: get the features of sensors
- GetCapability: get the capabilities of the SOS
Find out whether each API's output depends on the parameters passed.

JMT: A Powerful Java Modelling Tool
- Developed since 2002 by 10+ generations of PG and UG students at Politecnico di Milano and Imperial College London
- http://jmt.sourceforge.net/
- JMT is open source (GPL v2); size: ~4,000 classes, 21 MB of code, ~200k lines
- Download the jar file and simply run: java -jar JMT.jar
M. Bertoli, G. Casale, G. Serazzi. JMT: Performance Engineering Tools for System Modeling. ACM SIGMETRICS Performance Evaluation Review, 36(4), March 2009, pp. 10-15, ACM Press.

JMT: Java Modeling Tools
- JSIMgraph: queueing network model simulator with a graphical user interface
- JSIMwiz: queueing network model simulator with a wizard-based user interface
- JMVA: Mean Value Analysis and approximate solution algorithms for queueing network models
- JABA: asymptotic analysis and bottleneck identification for queueing network models
- JWAT: workload characterization from log data
- JMCH: Markov chain simulator

Analytical Modeling of SOS
[Model diagrams with delay stations and queueing stations, in a Postgres version and an HBase version.]

Modeling of SOS with Postgres
- Inputs: think time, number of threads, SOS modules
- A performance testing tool supplies the measured throughput and utilization
- Predict the maximum throughput and the response time for a specific deployment
- Predict performance for a different deployment using performance-mimicking benchmarks [Duttagupta, IoT 2016]
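
Closed-system predictions of this kind can be produced with exact Mean Value Analysis (the algorithm behind JMVA). The sketch below is a minimal Python implementation using hypothetical service demands and think time, not the talk's calibrated values:

```python
import numpy as np

def mva(demands, think_time, max_users):
    """Exact MVA for a closed product-form network: one delay station
    (the users' think time) plus FCFS queueing stations."""
    q = np.zeros(len(demands))        # mean queue length at each station
    results = []
    for n in range(1, max_users + 1):
        r = demands * (1.0 + q)       # residence time: demand inflated by queue seen on arrival
        R = r.sum()                   # total response time (excluding think time)
        X = n / (R + think_time)      # system throughput, by Little's law
        q = X * r                     # updated mean queue lengths
        results.append((n, X, R))
    return results

# Hypothetical demands (sec): Tomcat ~8.1 ms, Postgres ~4 ms; think time 1 s.
for n, X, R in mva(np.array([0.0081, 0.004]), 1.0, 512)[::128]:
    print(f"{n:4d} users: X = {X:6.1f}/s, R = {R * 1000:6.1f} ms")
```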

Performance of a Single API on AWS (using Postgres)
[Plots: throughput (txn/sec) and response time (ms) for PostObs on AWS vs number of users (0-600), actual vs predicted; response time reaches ~2 s at the high end.]
- Throughput saturates at 256 users, at 123 txn/sec.
- Response time exceeds 1 sec beyond 256 users.
- To scale to a higher number of users, we need to add more VMs to the Tomcat layer.

Mixed API for a Different Datastore (HBase)
[Plots: throughput and response time for GetObsLatest + PostObs vs number of users (0-600), actual vs predicted.]
- Service demands: PostObs SD = 12.5 ms, GetObsLatest SD = 5.2 ms
- Mix: GetObsLatest on 10% of the threads, PostObs on 90%
- Modeling helps predict the performance of a mixed API given that of each single API.

Does the API Support Horizontal Scalability?
[Plots: throughput (txn/sec) and response time (ms) for PostObs with 2 VMs on AWS vs number of users (0-1000), actual vs predicted; response time ~0.5 s.]
- With 2 Tomcat VMs, the application scales up to 512 users.
- The model predicts linear scalability for the API, and the actual test results confirm that two VMs scale to twice the number of users without increasing response time.

Number of VMs Required for a Response-Time SLA
- Response time SLA = 1 sec
- The Tomcat layer scales horizontally
- The Postgres VM needs to be upgraded to a bigger VM beyond 1280 users

Modeling Challenge: Service Demand Variability
Service demand varies with a higher number of threads, and it also depends on the API (a sketch of one way to handle this follows the table):

API            | No of threads | Tomcat service demand
---------------|---------------|----------------------
PostObs        | 64            | 11.2 ms
               | 768           | 8.5 ms
               | 1024          | 7.2 ms
GetObsLatest   | 64            | 8.7 ms
               | 256           | 3.8 ms
               | 512           | 2.4 ms
GetObsbySensor | 32            | 44 ms
               | 64            | 50.8 ms
               | 128           | 68.4 ms
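
One pragmatic way to handle this variability is to measure demands at a few concurrency levels and interpolate between them. This is a modeling convenience, not the talk's prescribed method; the numbers come from the PostObs rows above:

```python
import numpy as np

# Measured Tomcat service demands for PostObs at a few thread counts.
threads = np.array([64.0, 768.0, 1024.0])
demand_ms = np.array([11.2, 8.5, 7.2])

# Piecewise-linear interpolation estimates the demand at an untested concurrency.
print(f"Estimated demand at 300 threads: {np.interp(300, threads, demand_ms):.1f} ms")
```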

Modeling of a Rule Processing Engine
Factors impacting performance:
- Complexity of the rules applied to messages
- Payload of the messages
- Inter-arrival delay of consecutive messages
- Number of rules applied to an observation
- Number of message producers and consumers
- Server/VM architecture
We consider the effect of message rate and rule complexity.

Architecture of a Rule Processing Engine
- Messages first arrive at the tenant RabbitMQ, based on API keys
- The rule processing engine then routes them to various topic exchanges
- Multiple topic MQs exist for multiple tenants with the same topic

Model of a Rule Processing Engine

Performance of a Message Routing System
[Plot: latency (ms) in the MQ module vs flow rate (0-1400 msg/sec), actual vs predicted.]

Flow rate | Latency  | RabbitMQ CPU%
----------|----------|--------------
900/s     | 8 ms     | 80%
975/s     | 13 ms    | 87.4%
1050/s    | 1.98 sec | 93.5%

- Message latency is very low until either the RabbitMQ or the MR VM saturates.
- Once utilization exceeds 90%, latency can increase rapidly due to queue build-up at the server.
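
The knee in the latency curve is classic queueing behaviour. A single open M/M/1 station, with the per-message demand backed out of the table (0.80 / 900 ≈ 0.89 ms), already shows the trend; this sketch is an approximation and understates the measured latency at 1050 msg/sec:

```python
# Open M/M/1 approximation: mean response time R = D / (1 - U), which grows
# without bound as utilization U -> 1. D is inferred from the table above.
D = 0.80 / 900.0                      # ~0.89 ms of RabbitMQ CPU per message
for rate in (900.0, 975.0, 1050.0):   # flow rates from the table (msg/sec)
    U = rate * D
    print(f"{rate:.0f}/s: U = {U:.1%}, predicted R = {D / (1 - U) * 1000:.1f} ms")
```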

Combined Model with SOS Sending Data to MQ
- A PostObs request is forked after processing at SOS and is then processed by RabbitMQ + the MR engine
- Modeled using a combination of open and closed requests [ACM IoT 2016]

We have seen the performance analysis of two subsystems of the IoT platform. Next: performance analysis of a sample application on the IoT platform.

Sample App: Energy Monitoring System

Architecture
[Diagram: data flows from the platform to the backend.]

Modeling Problems
- How many more buildings can the current infrastructure support?
- How many online users can the dashboard support with the present deployment, for typical queries?

Q: How Many More Buildings Can Be Supported?
- Data comes from different sources: occupancy data and energy meter readings
- Find the distribution of the inter-arrival time of observations; the SOS log gives the arrival timestamp of each observation
- Two metrics are calculated from the inter-arrival time samples: mean and standard deviation

Backend Access Log
We extract the timestamps of successive observations being posted. From these timestamps, we calculate the inter-arrival time of each observation.
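
A minimal sketch of this extraction step, assuming the timestamps have already been parsed out of the access log into epoch milliseconds (the sample values are made up):

```python
import numpy as np

# Hypothetical epoch-millisecond timestamps of successive posted observations.
timestamps_ms = np.array([1000, 1022, 1065, 1081, 1150, 1212])

inter_arrivals = np.diff(timestamps_ms)            # gaps between observations
mean, std = inter_arrivals.mean(), inter_arrivals.std(ddof=1)
cv = std / mean      # coefficient of variation: ~1 for exponential arrivals,
                     # >1 suggests a hyper-exponential fit (next slides)
print(f"mean = {mean:.1f} ms, std = {std:.1f} ms, CV = {cv:.2f}")
```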

Probability Distribution of the Inter-Arrival Time
[Plot: PDF of inter-arrival time, 10-210 ms on the x-axis.]
- What trend does this distribution reflect?
- How do we compute the parameters of this distribution?
- If it were an exponential distribution with mean = 37.2, the standard deviation should also be 37.2, but it is 45.6.

What Is a Hyper-Exponential Distribution?
We need to find a set of exponential distributions and their mixing probabilities that match the distribution of the data:

f(t) = Σ_{i=1}^{n} p_i · λ_i · e^(−λ_i t), with Σ_{i=1}^{n} p_i = 1

Fitting a Distribution to the Data
We can find µ1, µ2 and their probabilities so that the mixture matches the desired mean and standard deviation:
- µ1 = 22, p1 = 0.6
- µ2 = 60.1, p2 = 0.4
This yields mean = 37.2 and std = 45.6. Ideally, though, we should use a fitting function to obtain the correct mixture so that it matches the shape of the whole distribution.
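
The moment-matching claim is easy to verify: for a two-branch hyper-exponential with branch means µ1, µ2, we have E[X] = p1·µ1 + p2·µ2 and E[X²] = 2(p1·µ1² + p2·µ2²), since each branch is exponential. A short check with the slide's numbers:

```python
import math

p1, mu1 = 0.6, 22.0    # branch 1: probability and mean (ms), from the slide
p2, mu2 = 0.4, 60.1    # branch 2

mean = p1 * mu1 + p2 * mu2                    # E[X]
second = 2 * (p1 * mu1**2 + p2 * mu2**2)      # E[X^2] of the mixture
std = math.sqrt(second - mean**2)
print(f"mean = {mean:.1f} ms, std = {std:.1f} ms")  # ~37.2 ms and ~45.6 ms
```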

Q: How Many Online Users Can Be Supported?
What if performance testing is not an option?
- Rely on the utilization monitoring interface
- Find the trend of the utilization data to derive the min, max, and average; history is available for 1 day, the last 7 days, or 1 month
- Is the utilization due to one type of workload?

When No Dashboard Queries Are Running
- Utilization is due to data injection and alert queries
- Utilization spikes every 25-30 min and remains high for 15 min; CPU% varies between 11% and 20%

Trend over a Day: What Are the Max and Average?

Deriving Service Demands for Different Workloads
- We compute the service demand based on the mean throughput derived from the log
- The front-end client node handles traffic only from the dashboard
- The backend data node handles traffic from the dashboard as well as from the sensor backend
[Diagram: sensor data, dashboard queries, and alert queries flowing into the ES backend.]

Deriving Service Demands for the ES Data Node
Use the least-squares technique:

U_ES = X_Data · D_Data + X_Alert · D_Alert + X_DashQ · D_DashQ

where X_Data is the throughput for data insertion, D_Data is the service demand for data insertion, and similarly for alert queries (Alert) and dashboard queries (DashQ). A sketch of the fit follows.
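
In this sketch of the least-squares step, each monitoring interval contributes one linear equation in the unknown demands, and numpy solves the over-determined system; all numbers are invented for illustration:

```python
import numpy as np

# Rows: per-interval throughputs [X_Data, X_Alert, X_DashQ] (requests/sec).
X = np.array([[50.0, 2.0, 10.0],
              [55.0, 2.0, 25.0],
              [48.0, 3.0, 40.0],
              [60.0, 2.5, 15.0]])
# Measured ES data-node CPU utilization in each interval (fraction of one CPU).
U = np.array([0.30, 0.41, 0.50, 0.37])

# Solve U ~= X @ D for the service demands D = [D_Data, D_Alert, D_DashQ] (sec).
D, *_ = np.linalg.lstsq(X, U, rcond=None)
print(dict(zip(["D_Data", "D_Alert", "D_DashQ"], np.round(D * 1000, 2))))  # in ms
```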

Model for the Energy Monitoring Subsystem

Takeaways from Modelling the Sample App
- Throughput in an open system mostly remains unchanged
- iowait% occurs due to Debug/Info logging
- Utilization over a short duration can become very high even at low concurrency
- A model can be built from production log data and utilization information

What Other Modeling Techniques Can We Use?
- Markov chains can be used when the system can be modeled as a set of states; we need to know the states and their transition rates
- Example: a real-life landslide monitoring system
  - A number of different types of sensors are used: rain gauge, pore pressure, humidity, movement
  - The system makes decisions based on the values of the sensor readings

Deriving Parameters for the Markov Chain
[State diagram (WNA Lab, Amrita University): the system moves between a power-drain OFF state and power-drain ON states S_r, S_234, S_rm, S_mp as sensor readings cross thresholds, e.g. rainfall r(t) against Th1, Th2, Th3, pore pressure p(t) against Th_p, and movement m(t) against Th_m.]
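
Once the states and transition rates are known, the steady-state probabilities follow from the balance equations πQ = 0 with Σπ = 1. The sketch below uses a placeholder three-state generator matrix, not the calibrated landslide-monitoring rates:

```python
import numpy as np

# Placeholder generator matrix Q: Q[i, j] is the rate of moving from state i
# to state j; each diagonal entry makes its row sum to zero.
# Illustrative states: 0 = OFF, 1 = S_r, 2 = S_234.
Q = np.array([[-0.5,  0.5,  0.0],
              [ 0.2, -0.6,  0.4],
              [ 0.0,  0.3, -0.3]])

# Solve pi @ Q = 0 together with the normalization sum(pi) = 1.
A = np.vstack([Q.T, np.ones(Q.shape[0])])
b = np.array([0.0, 0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)
print(np.round(pi, 3))   # long-run fraction of time spent in each state
```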

Summary: Things We Covered
- Basics of performance modeling using queueing networks: given an architecture, how to build the model for the system
- Performance modeling of an IoT platform: closed system, open system, number of VMs required, scalability analysis
- Performance modeling of a sample application running on an IoT platform: gathering the inter-arrival time distribution, deriving service demands of workloads, outcome of a performance model
- Other modeling techniques: Markov chains

Important Resources
- M. Bertoli, G. Casale, G. Serazzi. User-Friendly Approach to Capacity Planning Studies with Java Modelling Tools. Int'l ICST Conf. on Simulation Tools and Techniques (SIMUTools 2009), Rome, Italy, 2009, ACM Press.
- S. Kounev and A. Buchmann. Performance Modeling and Evaluation of Large-Scale J2EE Applications. In Proceedings of the Computer Measurement Group's Conference, 2003.
- E. Lazowska, J. Zahorjan, G. Graham and K. Sevcik. Quantitative System Performance: Computer System Analysis Using Queueing Network Models. Prentice-Hall, 1984.
- M. Harchol-Balter. Performance Modeling and Design of Computer Systems: Queueing Theory in Action.
- W. Stallings. A Gentle Introduction to Some Basic Queueing Concepts.
- R. Mansharamani, S. Duttagupta, A. Nehete. Automatically Determining Load Test Duration Using Confidence Intervals. CMG India, Pune, 2014.
- S. Duttagupta, R. Mansharamani. Extrapolation Tool for Load Testing Results. Int. Symposium on Performance Evaluation of Computer Systems and Telecommunication Systems, 2011.
- S. Duttagupta, M. Kumar and M. Nambiar. Performance Modeling of IoT Applications. 6th ACM Conference on Internet of Things (IoT 2016).

Open Issues
- How to account for the variability of the technology used by various sensors to connect to the IoT system
- How service demand varies with load or a higher flow rate
- What the fundamental limits are for a given technology stack under a certain kind of workload
