Diving into Big Data Workshop

Similar documents
Building a Django Twilio Programmable Chat Application

MIT AITI Python Software Development Lab DJ1:

webkitpony Documentation

Ninja Typers Web Application Design Document By Marvin Farrell C

DBNsim. Giorgio Giuffrè. 0 Abstract How to run it on your machine How to contribute... 2

Tomasz Szumlak WFiIS AGH 23/10/2017, Kraków

Software Test Plan Version 1.0

Django Web Framework: A Comprehensive Introduction and Step by Step Installation

Django PAM Documentation

The Structure of the Web. Jim and Matthew

Django urls Django Girls Tutorial

Django_template3d Documentation

Technology modeling. Ralf Lämmel Software Languages Team University of Koblenz-Landau

Rapid Development with Django and App Engine. Guido van Rossum May 28, 2008

kaleo Documentation Release 1.5 Eldarion

django-model-report Documentation

Python lab session 1

django-scaffold Documentation

Django Part II SPARCS 11 undead. Greatly Inspired by SPARCS 10 hodduc

open-helpdesk Documentation

DIGIT.B4 Big Data PoC

EXPERIENCES MOVING FROM DJANGO TO FLASK

Oracle Data Visualization SDK

ArcGIS for Server: What s New. Philip Heede, Jay Theodore

Django File Picker Documentation

Procedure to Flash Upgrade the TLink TL250/TL300

Amazon WorkSpaces Application Manager. Administration Guide

Homework 6: Pose Tracking EE267 Virtual Reality 2018

Colligo Console. Administrator Guide

Back-end development. Outline. Example API: Chatter. 1. Design an example API for Chatter. Tiberiu Vilcu.

Tablo Documentation. Release Conservation Biology Institute

Data Visualization With Chart.js. Alireza Mohammadi Fall 2017

A Sample Approach to your Project

django-oauth2-provider Documentation

Installing PHP on Windows 10 Bash and Starting a Local Server

django-dajax Documentation

Without Django. applying django principles to non django projects

SPC Documentation. Release Wesley Brewer

FCS Documentation. Release 1.0 AGH-GLK

Compiere 3.3 Installation Instructions Linux System - Oracle Database

django-xross Documentation

Bitnami OSQA for Huawei Enterprise Cloud

Django Deployment & Tips daybreaker

Django Graphos Documentation

kiss.py Documentation

MATLAB - Lecture # 4

colab Documentation Release 2.0dev Sergio Oliveira

Enabling High-Quality Printing in Web Applications. Tanu Hoque & Jeff Moulds

C exam IBM C IBM Digital Experience 8.5 Fundamentals

BriCS. University of Bristol Cloud Service Simulation Runner. User & Developer Guide. 1 October John Cartlidge & M.

Web Focused Programming With PHP

django-telegram-login Documentation

Firefox for Android. Reviewer s Guide. Contact us:

SIS offline. Getting Started

Iconasys Advanced 360 Product View Creator. User Guide (Mac OSX)

django-baton Documentation

Accelerating Information Technology Innovation

Django IPRestrict Documentation

Seeder Documentation. Release 0.1. Visgean Skeloru

Jupyter and TMVA. Attila Bagoly (Eötvös Loránd University, Hungary) Mentors: Sergei V. Gleyzer Enric Tejedor Saavedra

Django-Select2 Documentation. Nirupam Biswas

Quick Start Guide. Visual Planning 6 New Features

Episode 298. Getting Started With Spree

Installing VS Code. Instructions for the Window OS.

Tangent MicroServices Documentation

DJOAuth2 Documentation

Using Vista, Firefox for browser, Google for search engine and AVG for viruses.

Real-time Data Engineering in the Cloud Exercise Guide

Spade Documentation. Release 0.1. Sam Liu

Virtual CD TS 1 Introduction... 3

Pentaho BioMe App Demo. Installation, Access & Usage Instructions

User Guide Using AuraPlayer

Statirator Documentation

django-oscar-paypal Documentation

Client Side JavaScript and AJAX

Django Test Utils Documentation

TIBCO LiveView Web Getting Started Guide

Compiere 3.3 Installation Instructions Windows System - Oracle Database

The Courier Mail has prepared the information in this document to assist with implementation of our RSS news feeds.

Human-Computer Interaction Design

Organizing Your Network with Netvibes 2009

Firefox for Nokia N900 Reviewer s Guide

django-openid Documentation

Including Dynamic Images in Your Report

ganetimgr Documentation

CSE 115. Introduction to Computer Science I

Using Google Drive Some Basics

MapEntity Documentation

django-baton Documentation

Deploy IBM Spectrum Control Virtual Appliance into VMware ESXi V5.1 IBM

Enabling High-Quality Printing in Web Applications. Tanu Hoque & Craig Williams

SAURASHTRA UNIVERSITY

Operations Dashboard for ArcGIS Monitoring GIS Operations. Michele Lundeen Esri

Lab 4: create a Facebook Messenger bot and connect it to the Watson Conversation service

Easy-select2 Documentation

Manipulating Database Objects

Contents. 1. What is otree? 2. The Shell and Python. 3. Example: simple questionnaire. 4. Example: public goods game. 5. Test bots.

Assignment 2. Start: 15 October 2010 End: 29 October 2010 VSWOT. Server. Spot1 Spot2 Spot3 Spot4. WS-* Spots

micawber Documentation

Let's talk about QRadar Apps: Development & Troubleshooting IBM SECURITY SUPPORT OPEN MIC

Transcription:

Diving into Big Data Workshop Hands-on activity During this activity, you will learn how to obtain live streaming data from the Ocean Observatory Initiative, how to plot it in a web dashboard and how to apply some transformations to the data as it is being continuously received. Learn what are the available instruments with the interactive map: https://goo.gl/jjyb7m Download the starter code: https://anthony-simonet.fr/ooi-starter-code.zip Use users workshop with password workshop2018. 1

Data format Data in OOI is delivered in JSON, which means that each individual value is a JSON document similar to this: { } 'time': 3727684681.9, 'press_trans_temp': 3.21276087,... 'bottom_pressure': 2241.4755859375 time always represents the time when the measure was made, but all other keys depend on the particular instrument you are streaming from. On the interactive map, the list of available keys for each instrument is given under variables. Overview of the code Download and extract the starter code in C:\Users\workshop\ooi-starter-code\ (right-click on the zip file then Extract all ). To edit the code, you can use the Rodeo IDE (a shortcut is located on the Desktop). The project includes a Django web server (for serving web pages) and a Kafka client (for consuming data streams from OOI). The code you received contains the following directories and files (only the most important ones are listed here, the ones you need to edit are bold ): ooi-starter-code/ dashboard/ config.py settings.py urls.py views.py ooi/ client.py static/ css/ js/ templates/ index.html manage.py // Server starter script // Server setting // URL mappings // Template rendering // Static files for the client // Client homepage 2

Instructions Open a Windows command line Your code will be executed indirectly by the Django web server in a command window. To get started: - Open the Windows Start menu and run this command: cmd - In the window that appears, change to the directory where the code is, e.g. C:\Users\workshop\ooi-starter-code by running : cd ooi-starter-code Start/stop the web server To start the server, select the command window and execute: python manage.py runserver You can now view your dashboard by browsing http://localhost:8000 To stop the server, select the command window and press Ctrl+Break. You will likely have to restart the server after making certain changes to your code. Start consuming streaming data Now that your server is running, you have to instantiate an OOIClient object. To do so, make the following changes to dashboard/config.py : Line 14, set the correct Kafka server address ( 128.104.223.162:9092 ); Line 16, set the topics you want to subscribe to as a List of String (see the map to learn about which topics are available: https://goo.gl/jjyb7m ); After line 18, connect and start the server by calling the right methods on self._client (see ooi/client.py for available methods on OOIClient ). After restarting the web server, you should see the following messages on the console: 3

Worker thread started Plot the stream on the client You must now modify templates/index.html in order to add plots to http://localhost:8000. The page already includes JavaScript libraries that will automate the process of: Inserting a plot at a specified location in the page; Downloading the streaming data from the web server; Updating the plot with new data every few seconds. Your first chart will be displayed in the <canvas id="first-chart"></canvas> element that is already present (and invisible for now) on the page. You can add several <canvas> elements to plot different topics. The id attribute must be unique in the page! To create the plot, look for the // Your code here comment and use the following JavaScript function (where topic and variable are taken from the map): creategraph(element_id, topic, variable); After saving the template file and reloading the page, you should see an empty plot for a couple seconds, then the actual plot with new data coming every 5 seconds. Transform the data streams For now, all your web server does is consume a stream from OOI and forward the raw data to clients. However, you can leverage all the power of Python to apply transformations to the streams. To apply transformation to streams, you can provide as many handler functions as you wish to the OOIClient. The client will call these functions and pass them a List of data values every time new data is received from the OOI streams. The handler function must inspect the List of values and produce a new List object containing the transformed values. The code in dashboard/config.py demonstrates a simple function that squares every value in a topic (see function square ). 4

Using the following code, try and make a new handler function that computes the moving average of any topic and variable. Each point of a simple moving average is the average of the previous n points. This code computes a centered moving average, which is slightly more complex: def slidingaverage(topic, startindex): windowsize = 50 half = int(windowsize / 2) data = OOIClient.data[topic] if not 'bottom_pressure' in data[0]: return [] comp = OOIClient.data[topic + '-avg'] res = [] n = len(data) m = len(comp) if n < windowsize: return [] start = max(m, half) end = n - half for i in range(start, end): winstart = i - half winend = i + half sum = 0 for i in range(winstart, winend): sum += data[i]['bottom_pressure'] avg = sum / windowsize res.append({ 'time': data[i]['time'], 'bottom_pressure': avg }) return res Don t hesitate and try other transformations! Here are some more ideas: Convert temperatures from Celcius to Farenheit Remove outliers 5

6