Digital Democracy Video Manager

Similar documents
ECE 30 Introduction to Computer Engineering

The webmethods CloudStreams Amazon S3 Connector

Leveraging the Security of AWS's Own APIs for Your App. Brian Wagner Solutions Architect Serverless Web Day June 23, 2016

PubWC Bathroom Review App By Clay Jacobs Advisor: Alexander Dekhtyar Computer Science Department California Polytechnic State University 2017

LECTURE 15. Web Servers

Career Path Web Application. CSC 492: Senior Project II Student: Annamarie Roger Major of Study: Computer Science

Amazon Web Services Monitoring Integration User Guide

DIGIT.B4 Big Data PoC

SUSE Enterprise Storage Deployment Guide for Veritas NetBackup Using S3

IEMS 5722 Mobile Network Programming and Distributed Server Architecture Semester 2

Task 2: TCP Communication

Python web frameworks

Qualys Release Notes

Lab 2 Third Party API Integration, Cloud Deployment & Benchmarking

django- -gateway Documentation

flask-dynamo Documentation

Write On Aws. Aws Tools For Windows Powershell User Guide using the aws tools for windows powershell (p. 19) this section includes information about

Deploying a distributed application with OpenStack

How to Route Internet Traffic between A Mobile Application and IoT Device?

Microservices without the Servers: AWS Lambda in Action

Introduction to data centers

Better, Faster, Stronger web apps with Amazon Web Services. Senior Technology Evangelist, Amazon Web Services

Tiger Bridge 1.0 Administration Guide

AWS plug-in. Qlik Sense 3.0 Copyright QlikTech International AB. All rights reserved.

EDB Ark. Administrative User s Guide. Version 2.2

Quick start guide for Infscape UrBackup Appliance on Amazon Web Services

Naseem S, Fares Sh, Milad B, Samih T

PROGRAMMING GOOGLE APP ENGINE WITH PYTHON: BUILD AND RUN SCALABLE PYTHON APPS ON GOOGLE'S INFRASTRUCTURE BY DAN SANDERSON

/ Cloud Computing. Recitation 5 September 26 th, 2017

AWS 101. Patrick Pierson, IonChannel

Commvault Backup to Cloudian Hyperstore CONFIGURATION GUIDE TO USE HYPERSTORE AS A STORAGE LIBRARY

Fal Spring C a l i f o r n i a P o l y t e c h n i c S t a t e U n i v e r s i t y

EDB Ark Administrative User s Guide. Version 2.1

Cloud Computing Architecture

Starting the Avalanche:

At Course Completion Prepares you as per certification requirements for AWS Developer Associate.

AWS Lambda + nodejs Hands-On Training

Azure Marketplace Getting Started Tutorial. Community Edition

Lambda Architecture for Batch and Real- Time Processing on AWS with Spark Streaming and Spark SQL. May 2015

IoT Device Simulator

The LGI Pilot job portal. EGI Technical Forum 20 September 2011 Jan Just Keijser Willem van Engen Mark Somers

San Jose State University College of Science Department of Computer Science CS185C, Introduction to NoSQL databases, Spring 2017

Level Up Your CF Apps with Amazon Web Services

Immersion Day. Getting Started with AWS Lambda. August Rev

NGF0502 AWS Student Slides

Azure Marketplace. Getting Started Tutorial. Community Edition

How Cisco IT Improves Commerce User Experience by Securely Sharing Internal Business Services with Partners

Life Without DevStack: OpenStack Development With OSA. Miguel

Thomson Reuters Graph Feed & Amazon Neptune

DataMan. version 6.5.4

MobileTurk Report CIS : CIS Senior Design Zach Krasner, Kate Miller, Alex Whitaker

Advanced Usage of the AWS CLI. Brian Wagner, Solutions Architect

Alteryx Technical Overview

Hadoop Integration User Guide. Functional Area: Hadoop Integration. Geneos Release: v4.9. Document Version: v1.0.0

Testing in AWS. Let s go back to the lambda function(sample-hello) you made before. - AWS Lambda - Select Simple-Hello

AWS SDK for Node.js. Getting Started Guide Version pre.1 (developer preview)

Cache Memory Mapping Techniques. Continue to read pp

CloudBATCH: A Batch Job Queuing System on Clouds with Hadoop and HBase. Chen Zhang Hans De Sterck University of Waterloo

Video on Demand on AWS

Cloud Computing 3. CSCI 4850/5850 High-Performance Computing Spring 2018

Serverless Computing. Redefining the Cloud. Roger S. Barga, Ph.D. General Manager Amazon Web Services

Sport performance analysis Project Report

Dynamic Routing and Network Monitoring for the Polywog Protocol

Exploiting Cloud Storage With DB2 for LUW

Serverless Architecture Hochskalierbare Anwendungen ohne Server. Sascha Möllering, Solutions Architect

/ Cloud Computing. Recitation 5 February 14th, 2017

Trending with Purpose. Jason Dixon

SOA & REST. Ola Angelsmark

Democratized Performance Test Platform. Open source, enterprise ready modular platform, that is tool chain friendly.

Embedded Technosolutions

Homework 9: Stock Search Android App with Facebook Post A Mobile Phone Exercise

University of Waterloo Midterm Examination Sample Solution

STORAGE MADE EASY: S3 DRIVE & S3 EXPLORER

Mission Guide: Amazon S3

Course Outline. Introduction to Azure for Developers Course 10978A: 5 days Instructor Led

I hate money. Release 1.0

Fundamentals of Python: First Programs. Chapter 4: Text Files

Oracle Transparent Gateways

Tableau Server Platform Monitoring

About the Tutorial. Audience. Prerequisites. Disclaimer & Copyright. TurboGears

BriCS. University of Bristol Cloud Service Simulation Runner. User & Developer Guide. 1 October John Cartlidge & M.

Administrator Guide Administrator Guide

Test On Line: reusing SAS code in WEB applications Author: Carlo Ramella TXT e-solutions

AWS Lambda: Event-driven Code in the Cloud

Adobe ColdFusion 11 Enterprise Edition

Zombie Apocalypse Workshop

SwiftStack Object Storage

Configuring AWS for Zerto Virtual Replication

Coveo Platform 7.0. Atlassian Confluence V2 Connector Guide

USING VERITAS ENTERPRISE VAULT WITH DELL EMC ELASTIC CLOUD STORAGE

The webmethods CloudStreams Amazon EC2 Connector

San Jose State University College of Science Department of Computer Science CS185C, NoSQL Database Systems, Section 1, Spring 2018

Luigi Build Data Pipelines of batch jobs. - Pramod Toraskar

Port Utilization in SocialMiner

SAS Viya : The Beauty of REST in Action

City.Risks SDK. Deliverable D3.6. Editor Nikos Bakalos Dimitris Zografos (ICCS) Contributors N. Papadakis, A. Litke, A. Anagnostopoulos (INFT)

Cloud Computing /AWS Course Content

Com S 321 Problem Set 3

Jackalope Documentation

Testbed-12 TEAM Engine Virtualization User Guide

Transcription:

Digital Democracy Video Manager by Scott Lam Computer Science Department College of Engineering California Polytechnic State University 2016 Date Submitted: March 18, 2016 Advisor: Alex Dekhtyar

Table of Contents Table of Figures... ii Abstract... iii Background... 1 Use Case... 2 Design... 3 Original Design... 3 After Further Research... 3 Implementation... 4 Technology & Frameworks... 4 File Hierarchy & File Descriptions... 4 Environment Variables... 5 API... 5 Cache... 5 Setup & Testing... 6 Setup... 6 Testing... 6 Conclusion... 7 i

Table of Figures Figure 1. Use Case Example Showing Video Manager Integration With Video Cutting Feature. 2 Figure 2. Digital Democracy Video Manager Design... 3 Figure 3. Video Manager File Hierarchy... 4 ii

Abstract The purpose of this senior project was to design and implement a video manager that provides scalability by caching videos. Currently, there are about 3600 videos associated with the Digital Democracy project. These videos are only for the California legislative system and as the project expands to multiple states the amount of videos will rapidly increase. By creating this video manager, disk space on the Digital Democracy server will be saved. All videos can be hosted on the cloud storage service and when operations need to be performed on those videos, they will be downloaded from the cloud. iii

Background Digital Democracy is a project created by The Institute for Advanced Technology and Public Policy (IATPP) at California Polytechnic State University, San Luis Obispo. Digital Democracy delivers its database of California legislative hearings and transcriptions through its website http://www.digitaldemocracy.org. The goal of the project is to make legislative data, normally difficult to find or not available to the public at all, available to the public in an easily searchable format. IATPP seeks to provide more features in its product, for example, the ability for Digital Democracy users to cut videos and stitch the videos together to make a collage. 1

Use Case The figure below illustrates how the video manager would interact with one of Digital Democracy s features video cutting/collaging. First, the user requests that a video be cut or multiple videos be concatenated from the Digital Democracy website. Next, that video is sent to the video cutting tool. In order for operations to be performed on videos, they need to be available locally. This is where the video manager comes in. The video cutting tool sends a request to the video manager for the video(s) and the video manager returns the path to those video(s). The video cutting tool then performs operations on those video(s) and the result is passed back to the user. Figure 1. Use Case Example Showing Video Manager Integration With Video Cutting Feature. 2

Design The design of the video manager is shown below in Figure 1. An API provides a high-level interface to interact with the cache. If there is a hit, the path to the video is returned. Otherwise, the video is downloaded from the cloud storage service and the path to the video is returned to the API. Figure 2. Digital Democracy Video Manager Design Original Design Originally, the video manager was designed as a daemon listening to a socket using Python s python-daemon and socket library. The cache would be a daemon listening to a socket and the API would communicate with the cache through that socket. After Further Research A better solution was found that would provide a higher level framework for achieving the same functionality and would be faster and easier to set up. This solution was to implement a HTTP server that would receive requests from the API. The API allows users to retrieve videos without having to know how it works. 3

Implementation This section will describe, in detail, how the video manager was actually set up after the design was established. Technology & Frameworks Amazon Web Services - EC2, S3 Python 2.7 Flask, WSGI Apache File Hierarchy & File Descriptions The figure below illustrates the structure of the video manager. Descriptions of the purpose of each file are also explained. videomanager cache cache.py cache.yaml init.py server.py environment.py exceptions.py init.py request.py Figure 3. Video Manager File Hierarchy environment.py Contains functions that retrieve environment variables. exceptions.py Contains classes for exceptions thrown. request.py Contains API function calls. cache.py Creates the Flask application and handles HTTP requests. server.py Contains all server-related functions that are called by the cache. cache.yaml Settings file for the cache. 4

Environment Variables API DD_CACHE_PATH The path to the cache directory where videos are stored. DD_CACHE_HOST The host where the cache is located. DD_CACHE_PORT The port that the cache will be using. AWS_ACCESS_KEY_ID The access key for Amazon S3. AWS_SECRET_ACCESS_KEY The secret key for Amazon S3. AWS_BUCKET Name for the bucket where videos are stored on Amazon S3. The API sends a GET request to the HTTP server which is acting as the cache. Cache The cache implements the Least Recently Used (LRU) algorithm for keeping videos. It checks each video s modification time to determine the least recently used video. When a video is requested the cache: 1. Checks if the video is in the cache a. If it is, update the modification time and return its path; otherwise, continue. 2. Creates a connection to S3 and get the bucket that contains all videos. 3. Gets the threshold (cache size) from the settings file. 4. Calculates how much space will be occupied after the video is downloaded. 5. If the new size is greater than the threshold, keeps removing the least recently used video until there is enough space to accommodate the new video. 6. Downloads the video from Amazon S3. 5

Setup & Testing Setup The HTTP server was set up using Python s Web Server Gateway Interface (WSGI) as an interface to the Apache web server. Specifically, Apache s mod_wsgi module was used to set up the video manager. An entry in Apache s virtual-hosts file was created to register the host name and port to the video manager directory. A.wsgi file was created to be invoked by mod_wsgi. Testing The video manager was tested using the following test cases: Downloading a video larger than the threshold when the cache was empty. Downloading a video larger than the threshold when the cache was occupied. Requesting a video that did not exist in Amazon S3. The video existed in the cache (hit). The video did not exist in the cache, but existed in Amazon S3 (miss). 6

Conclusion This system, the Digital Democracy Video Manager, was able to allow scalability regarding Digital Democracy s video storage by keeping all videos on the cloud storage service- Amazon S3 instead of on the server. Those videos stay on Amazon S3 until operations are to be performed on one of those videos. The cache uses the Least Recently Used algorithm to determine which video will be replaced when another video is requested. One important aspect that wasn t able to be done during this project was gathering statistics since Digital Democracy s video manipulation features have not been deployed yet. However, in the future statistics such as cache hits and misses can be recorded and different cache algorithms can be implemented besides the Least Recently Used algorithm. By implementing different algorithms, the most efficient one can be found for the Digital Democracy Video Manager. 7