Behind the scenes - Continuous log analytics

Here at Loway, we take data very seriously. QueueMetrics Live, our hosted call-center analytics platform for Asterisk PBXs, is constantly monitored and probed for anomalies in its performance metrics. This way, we are able to find and fix issues - often automatically - before they become apparent to our growing number of users.

The goals

Over the years we developed a number of scripts, using different technologies (Perl, bash, Ruby), that search for specific patterns of interest across our logs, categorize them and push them to an InfluxDB database for central anomaly detection.
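Each categorized event ultimately becomes a point in InfluxDB; in its line protocol, one such counter could look like this (measurement name, tags and values are made up for illustration):

qm_requests,server=web01,bucket=monitoring count=42i 1529917200000000000

Tags like the server name are what later let us tell server-level anomalies apart from customer-level ones.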

As our platform is built on top of a good number of servers that are commissioned and decommissioned according to expected occupancy and behavioral patterns, we needed a way to correlate events happening on different layers and understand whether they are caused by anomalies at the customer level or at the server level.

Most of these scripts read data that comes in as HTTP logs, where we look for specific parameters (type of request, elapsed time, response size, errors) and categorize these calls into different buckets. Categorization rules evolve and change over time (for example: all QueueMetrics instances are monitored by feeding in transactions and checking that results are correct and returned within a specific time frame, but we make a distinction between transactions run for automated monitoring and transactions started by our customers), while the basic format of all our queries is always the same: read incremental logs, process the new lines and return results to InfluxDB.

We saw an opportunity for Clojure in this task, as it seemed well suited to ETL-style data categorization and transformation, and being very terse means that the code that has to be maintained ended up being smaller than the Ruby equivalent.

Implementation

Is it fast enough?

As Clojure runs on the JVM, it has performance characteristics that match the JVM's - so it is very efficient when working on large homogeneous batches, but we feared its startup times would pose an excessive load, especially on busy (and not especially powerful) web proxies. So we ran some tests by assembling together all the libraries we needed and initializing them, to measure the fixed cost of Clojure:

$ time java -jar target/l2i-standalone.jar

real    0m1.986s
user    0m3.976s
sys     0m0.213s
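The jar in this test is a dry run: its entry point does nothing but load and initialize the libraries we depend on. A minimal sketch of such an entry point (the namespace name and message are illustrative; the requires match the libraries used later in this post):

(ns l2i.core
  (:gen-class)
  (:require [clash.core]
            [cheshire.core]
            [clojure.spec.alpha]))

(defn -main
  "Do nothing: loading the namespaces above is the cost we are timing."
  [& args]
  (println "l2i: dry run complete"))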

This ends up being acceptable, because those jobs are called by a cron job at given intervals; as these are dry runs in which no data is actually processed, this is a bottom-line fixed cost for adoption.

Log processing in Clojure

Once we saw that Clojure + the JVM were an acceptable environment that we could run on our production servers without excessive load, we started analyzing logs with Clojure. And this experience was very good.

(:require [clash.core :as clash])

;; Helper used by log-parser below; not shown in the original listing.
(defn parse-int [s] (Integer/parseInt s))

;; One parsed log line.
(defrecord LogRecord
  [ip date time method uri referrer status useragent bytes])

;; Capture groups: IP, date, time, method, URI, status, bytes,
;; referrer, user agent.
(def log-pattern
  #"^(\S+) \S+ \S+ \[(...):(...).+?\].(\S+) (\S+) \S+ (\d+) (\d+) \"(\S+)\" \"(.+?)\"")

(defn log-parser [line]
  (if-let [[_ i d t m u s b r ua] (re-find log-pattern line)]
    (->LogRecord i d t (keyword m) u r (keyword s) ua (parse-int b))))

(defn load-log-verbose [file]
  (clash/transform-lines-verbose file log-parser :max 10000000))

This is basically all the logic you need to parse files and store them in a common parsed structure. We created a LogRecord to store data and a parser that uses a regexp to find data in logs, and we use the excellent (and quite performant) library Clash to parse them very efficiently in memory. Clash then gives us the results of parsing and a list of lines where parsing failed, which we store for further inspection.

As we only want to read incremental lines, we save a brainfile that stores the file we are parsing, a SHA of its first line and the current offset.
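A brainfile is tiny; with made-up values, it looks something like this (the keys match the map written by save-brain-file below):

{
  "sha": "b5cc17d3...",
  "offset": 1048576,
  "filename": "/var/log/httpd/access_log",
  "lastupdate": "Mon Jun 25 10:00:00 CEST 2018"
}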

If the file remains the same (that is, the first line has not changed) we read from the current offset onwards and write a new brainfile; if it has changed, we read it from the beginning. This required very little code: using Cheshire for reading/writing JSON and Clojure Spec, we can easily handle the case where the file does not exist and return a dummy file instead of an exception. We can also check (as a post-condition) that what we read was actually the right kind of file.

(defn load-brain-file [filename]
  {:post [(spec/valid? ::brainfile %)]}
  (try
    (let [contents (slurp filename)]
      (json/parse-string contents (fn [k] (keyword k))))
    ;; Missing or unreadable file: start over from offset 0.
    (catch Exception e {:sha "" :offset 0})))

;; file-size and generate-file-signature are small helpers of ours (not shown).
(defn save-brain-file [brainfile filename]
  (let [offset (file-size filename)
        sha    (generate-file-signature filename)
        brain  {:sha sha
                :offset offset
                :filename filename
                :lastupdate (str (java.util.Date.))}]
    (json/generate-stream brain (clojure.java.io/writer brainfile))))
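The ::brainfile spec used in the post-condition is not shown in the listing above; a minimal version, assuming we only insist on the keys we actually rely on, might be:

(require '[clojure.spec.alpha :as spec])

(spec/def ::sha string?)
(spec/def ::offset nat-int?)
(spec/def ::brainfile (spec/keys :req-un [::sha ::offset]))

Because load-brain-file keywordizes the JSON keys, the unqualified :req-un keys match both a freshly read brainfile and the dummy map returned on failure.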

Business logic

Business logic - for each of the cases supported - ends up being a single, stateless function that maps over a sequence of LogRecords and returns a set of measurements to be added to InfluxDB. It is usually made up of a function (in our case below, named downloads-which) that returns a hash of attributes, or nil if we don't care about that specific record; then nils are stripped out and we count unique hashes.

(defn downloads-parse [logrecords]
  (let [items         (map downloads-which logrecords)
        items-present (filter some? items)]
    (frequencies items-present)))
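downloads-which itself is not shown here; a hypothetical categorizer of this shape simply inspects each LogRecord and returns a map of the attributes worth counting, or nil:

(defn downloads-which
  "Hypothetical example: count successful downloads of .tgz packages,
   ignore everything else."
  [{:keys [method uri status]}]
  (when (and (= :GET method)
             (= :200 status)                 ; status was stored as a keyword
             (.endsWith ^String uri ".tgz"))
    {:bucket "downloads" :file uri}))

Since downloads-parse just runs frequencies over these maps, every distinct attribute combination becomes one counter to be pushed to InfluxDB.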

To run the loader, we pack everything into a single Uberjar and parse its command-line parameters. By using clojure.tools.cli, creating a well-behaved command-line tool with help and validation took us very little time.

As we have multiple distinct pipelines that measure different elements (e.g. instance activity, download links, data uploads, etc.) and a server might have data for more than one pipeline, to avoid paying the price of starting the loader multiple times we create a "multibrain" file: a JSON representation of an array of command-line parameters that we want invoked in sequence. We validate our JSON input using Spec and then share most of the code with our command-line parser.

Final notes

The resulting script is packaged as an Uberjar; the resulting size is acceptable (~7M), though we had to be careful with transitive dependencies (e.g. the innocent-looking URL-parsing library Cemerick made it 3x as big because it imports a lot of other libraries we did not need, so we preferred using apache.http.client through Java interop instead). In the end, we run our loader with a simple cron job and a customized multibrain file for each server. When updating the loader, we deploy a new JAR file and it just works.

Developing analytics with a REPL proved to be a productive experience, because you can immediately see the results of what you are doing, so it is relatively easy to check the details and test different approaches. When you then find something, you can create a test so you are sure the behavior won't change by mistake in further iterations. Java interop was also very important in getting things done, as it gave us a huge pool of libraries to choose from and made it possible to do low-level file access when needed.

Try the QueueMetrics suite of services: https://www.queuemetrics.com/buy.jsp