Fishing Activity Visualization with Free Software Bigdata Analytics Institute
- Lily Washington
- 5 years ago
Transcription
1 Fishing Activity Visualization with Free Software
Bigdata Analytics Institute
Erico N de Souza, PhD
erico.souza@dal.ca
Souza, Latouf (Bigdata Inst.) Bigdata Institute 1 / 22
2 Introduction
What would you do if you received a task to manage 6.5 billion GPS points?
- How to store it?
- How to retrieve it?
- How to visualize it?
- How to use some Machine Learning (ML) on it?
This presentation does not intend to teach the audience which tool is the best. See it as a generic guideline on what could be used to tackle the problem. All tools have their uses, but you have to think and test to decide when to use them...
3 AIS Data
- Automatic Identification System (AIS): a protocol for vessel position reports
- Multiple message types exist, but 90% of messages are of types 1, 2 and 3
- Message types 1, 2 and 3 contain more than 20 fields
- In our experience, we only used 6 of them
4 Current Workflow...
[Diagram; labels: Database Storage, Pre-process/Cleaning, DBA, ExactEarth]
5 Which tools are best for data retrieval?
It really depends on the problem...
Relational DBs can be very efficient at retrieving very large datasets.
The main advantages:
1. A mathematically sound, unified language (SQL);
2. Normalization guarantees (if wisely used...);
3. Data integrity.
The main issues with relational DBs:
1. Parallel servers are not easy to configure;
2. Schemas are not easily adaptable (new fields cannot easily be added to a table);
3. Not scalable when the data reaches billions of instances...
If you are clever, you may design a schema with summaries of portions of the data, which avoids touching each data point...
It all depends on your client's needs...
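The "summaries of portions of the data" idea can be sketched with Python's built-in sqlite3 module. This is only an illustration under stated assumptions: the table names, column names, and the one-degree grid are hypothetical and do not come from the presentation.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE positions (mmsi INTEGER, ts TEXT, lon REAL, lat REAL)")
rows = [
    (111, "2016-01-03T10:00:00", -63.57, 44.65),
    (111, "2016-01-03T10:05:00", -63.52, 44.61),
    (222, "2016-01-20T08:00:00", -61.10, 45.20),
    (222, "2016-02-01T09:00:00", -59.90, 46.10),
]
conn.executemany("INSERT INTO positions VALUES (?, ?, ?, ?)", rows)

# Summary table: one row per (month, 1-degree cell) instead of one row per point.
# Shifting by 180/90 keeps the values non-negative, so the integer cast
# truncates the same way a floor would.
conn.execute(
    """
    CREATE TABLE monthly_cell_summary AS
    SELECT substr(ts, 1, 7)            AS month,
           CAST(lon + 180 AS INTEGER)  AS cell_x,
           CAST(lat + 90 AS INTEGER)   AS cell_y,
           COUNT(*)                    AS n_points
    FROM positions
    GROUP BY substr(ts, 1, 7),
             CAST(lon + 180 AS INTEGER),
             CAST(lat + 90 AS INTEGER)
    """
)

# Queries for a month read the small summary table, not the raw points.
summary = conn.execute(
    "SELECT cell_x, cell_y, n_points FROM monthly_cell_summary "
    "WHERE month = ? ORDER BY cell_x, cell_y",
    ("2016-01",),
).fetchall()
print(summary)
```

Whether such a summary is acceptable depends on the questions the client asks: it answers "how many points per cell per month" cheaply, but it cannot return individual positions.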
6 Which tools are best for data retrieval? (cont.)
The Non-SQL solutions are an option. Which one to use?
Main issues:
1. Lack of normalization (Big Table);
2. Lack of a unified search language;
3. Many tools do not have good documentation;
4. Storage does not guarantee data integrity...
Main reasons to use them:
1. Promise of faster searches;
2. Parallelism;
3. Flexibility to create new fields for the stored data;
4. Data replication...
Different tools were tested to answer the questions posed at the beginning...
7 Non-SQL Tools Evaluated...
Cassandra
- Distributed hierarchical database (the order in which you list the fields at creation time determines how the table is partitioned)
- Language tries to resemble SQL syntax
- Queries have to include all fields when the user applies a restriction or projection
- Low adaptability
MongoDB
- Distributed document database (purely based on JSON)
- Implemented in C++ and uses its own protocol for data transfer
- Does not require the user to define a schema, and is more flexible when executing queries
- Slower than other tools at retrieving statistics from the fields, and requires some effort to configure multiple nodes
- High adaptability
8 Non-SQL Tools Evaluated...
Solr
- Search tool (purely based on JSON)
- Depends on a Java API to provide results to queries (Solr is a web service)
- Cloud configuration and data distribution are simple
- A query is executed by sending an HTTP request to a Solr server
- High adaptability
The decision really was between MongoDB and Solr. Solr was chosen because it is faster at summarizing data and easier to configure in distributed mode. This does not mean you should choose Solr as your solution...
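Since a Solr query is just an HTTP request, the sketch below builds one such request URL with the Python standard library, without sending it. The `/select` handler and the `q`, `fq`, `rows`, `stats`, `stats.field` and `wt` parameters are standard Solr; the host, the core name `ais_2016_01`, and the field names `mmsi`, `timestamp` and `sog` are assumptions made for illustration, not the presenters' actual configuration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_solr_query(base_url, core, mmsi, start, end):
    """Build a Solr /select URL that asks for summary statistics only.

    Field names (mmsi, timestamp, sog) are hypothetical.
    """
    params = [
        ("q", "*:*"),                              # match everything...
        ("fq", f"mmsi:{mmsi}"),                    # ...then filter by vessel
        ("fq", f"timestamp:[{start} TO {end}]"),   # and by time range
        ("rows", "0"),                             # no documents, statistics only
        ("stats", "true"),
        ("stats.field", "sog"),                    # min/max/mean etc. of speed
        ("wt", "json"),
    ]
    return f"{base_url}/{core}/select?{urlencode(params)}"


url = build_solr_query(
    "http://localhost:8983/solr",   # hypothetical server
    "ais_2016_01",                  # hypothetical core/index name
    316001234,
    "2016-01-01T00:00:00Z",
    "2016-02-01T00:00:00Z",
)
print(url)
```

Asking for `rows=0` together with the stats component returns only the summary, which matches the kind of fast summarization that the slide gives as the reason for choosing Solr.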
9 Some Details about the indexes...
- Each index has around 500 million points. This number was chosen to limit the size of the index.
- Each index is roughly 110 GB in size.
- Not every AIS message field is available for searching. Messages 1, 2 and 3 have more than 20 different fields, so we have to choose which of them will be searchable.
- Based on previous experience, we decided to store only 6 fields in Solr; the rest can be obtained on request.
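A back-of-the-envelope check of these figures, assuming the 6.5 billion points mentioned in the introduction, indexes filled to the stated 500 million points, and 1 GB = 10^9 bytes. The slides give rounded numbers, so the results are estimates only.

```python
import math

TOTAL_POINTS = 6_500_000_000      # from the introduction slide
POINTS_PER_INDEX = 500_000_000    # "around 500 million points" per index
INDEX_SIZE_GB = 110               # "roughly 110 GB" per index

# Number of indexes needed if every index is filled to its limit.
n_indexes = math.ceil(TOTAL_POINTS / POINTS_PER_INDEX)

# Total index storage and average index bytes per stored point.
total_gb = n_indexes * INDEX_SIZE_GB
bytes_per_point = INDEX_SIZE_GB * 10**9 / POINTS_PER_INDEX

print(n_indexes, total_gb, bytes_per_point)
```

Under these assumptions the dataset needs 13 indexes and about 1.4 TB of index storage, or roughly 220 bytes per point for the six searchable fields.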
10 Solr Organization
11 Machine Learning (ML) on Large Datasets...
Problem definition: fishing activity detection
- Different types of vessels: trawlers and longliners
- Data labeled by an expert biologist (16 longliners and 8 trawlers, close to 1 million points)
Which tool to use?
- In previous work, we invested heavily in R as a tool for ML tasks
- It can be used to build and test initial models, but it is not scalable...
- Python offers many ML libraries and tools for fast access to Solr...
12 ML on Large Datasets... (cont.)
- A fast algorithm for trajectory segmentation was implemented, and extra features based on the segments were calculated
- The result of this pre-processing was given to AdaBoost + Random Forest for training and prediction
- Only a portion (60%) of the labeled data from longliners was used for training, and the algorithm is able to predict fishing by trawlers and longliners with 90% accuracy.
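The slides do not describe the segmentation algorithm itself. The sketch below is therefore not the authors' method; it is a deliberately simple stand-in that splits a track wherever the speed crosses a threshold and then computes a few per-segment features of the kind a classifier could consume. The threshold of 5 knots, the `(time_s, speed_knots)` point layout, and the feature names are all assumptions.

```python
from statistics import mean


def segment_by_speed(points, threshold_knots=5.0):
    """Split a time-ordered track into runs of consecutive points that are
    all below, or all at/above, the speed threshold.

    Each point is a (time_s, speed_knots) tuple (hypothetical layout).
    """
    segments = []
    current = []
    for point in points:
        if current:
            was_slow = current[-1][1] < threshold_knots
            is_slow = point[1] < threshold_knots
            if was_slow != is_slow:      # speed regime changed: close the segment
                segments.append(current)
                current = []
        current.append(point)
    if current:
        segments.append(current)
    return segments


def segment_features(segment):
    """Compute simple per-segment features (illustrative choice of features)."""
    speeds = [speed for _, speed in segment]
    return {
        "n_points": len(segment),
        "mean_speed": mean(speeds),
        "duration_s": segment[-1][0] - segment[0][0],
    }


track = [(0, 10.2), (60, 9.8), (120, 3.1), (180, 2.7), (240, 2.9), (300, 11.0)]
segments = segment_by_speed(track)
features = [segment_features(s) for s in segments]
print(features)
```

In a pipeline like the one on this slide, each feature dictionary would become one row of the training matrix passed to the classifier.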
13 Segmentation Results
14 Visualization
Web applications that make searching and plotting data easy
Requirements:
- Data secured behind user login
- Access based on user role
- Map-based tools/widgets
Yii2 PHP framework:
- MVC architecture
- User role configuration
- Plugins and widgets
15 Fishing Observer
Python and pysolr
- Display fishing activity, both by volume and by individual points
CesiumJS
- JavaScript-based geospatial 3D mapping platform for creating virtual globes
- Industry-standard vector formats, such as KML, GeoJSON, and TopoJSON, including terrain clamping
- Timeline and animation widgets for controlling simulation time
Datashader
- Python library
- Simply adds a layer on top of Cesium
- Quickly generates heatmaps from millions of data points or more
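To make the heatmap idea concrete, the sketch below aggregates points into a fixed-size grid of counts, which is the general kind of aggregation behind such heatmaps. It is a plain-Python stand-in for illustration, not Datashader's API; the bounding box, the grid size, and the function name `bin_points` are assumptions.

```python
from collections import Counter


def bin_points(points, lon_min, lon_max, lat_min, lat_max, width, height):
    """Count (lon, lat) points per cell of a width x height grid.

    Returns a Counter keyed by (row, col); points outside the box are skipped.
    """
    counts = Counter()
    for lon, lat in points:
        if not (lon_min <= lon < lon_max and lat_min <= lat < lat_max):
            continue
        col = int((lon - lon_min) / (lon_max - lon_min) * width)
        row = int((lat - lat_min) / (lat_max - lat_min) * height)
        counts[(row, col)] += 1
    return counts


points = [(-63.5, 44.5), (-63.4, 44.6), (-60.5, 46.5), (10.0, 10.0)]
counts = bin_points(points, -64.0, -60.0, 44.0, 48.0, width=4, height=4)
print(dict(counts))
```

The output size depends only on the grid, not on the number of input points, which is why a summary like this can be drawn on a map when the individual points cannot.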
16 Monthly Fishing
- Data is searchable by month
- Heatmaps for each week and for the full month
17 Monthly Fishing
18 Monthly Fishing
19 Exporting Data
- CSV files stored by user name, accessible by search information
- In the future, we may store saved searches
20 Conclusions & Future Directions
For Data Management...
- Cloud systems seem to be the only scalable method for working with very large datasets
- Multiple choices, but which is the correct one? (For some problems, RDBs are still a competitive method...)
- Learning cost in both language and technology
For Machine Learning...
- Python is at the moment the best scalable tool for dealing with large datasets...
- Exploit parallelism within Python: Numba
21 Conclusions & Future Directions
For Visualization...
- As presented, we cannot show every data point: all development has to balance what will be presented with its meaning
- Build a data summary (in our case, a heatmap...)
- Tools are improving fast: we used Python tools to build the heatmap, but it can also be achieved with JavaScript...
Open Questions:
- How to integrate different types of data? (Raster or sound)
- Could the indexes be smaller?
- What types of patterns are of interest?
- Algorithm efficiency for large datasets...
For the next version: saving queries and maps, use of a cloud environment with support from Compute Canada,...
22 Tool's URL...
Introduction to ArcGIS Server 10.1 E-Learning for the GIS Professional Any Time, Any Place! geospatialtraining.com Module Outline What is ArcGIS Server? GIS Resources and Services ArcGIS Server Components
More informationPhp And Mysql Manual Simple Yet Powerful Web Programming
Php And Mysql Manual Simple Yet Powerful Web Programming It allows you to create anything from a simpledownload EBOOK. Beginning PHP 6, Apache, MySQL 6 Web Development Free Ebook Offering a gentle learning
More informationMySQL for Database Administrators Ed 3.1
Oracle University Contact Us: 1.800.529.0165 MySQL for Database Administrators Ed 3.1 Duration: 5 Days What you will learn The MySQL for Database Administrators training is designed for DBAs and other
More informationDatabase Management Systems MIT Introduction By S. Sabraz Nawaz
Database Management Systems MIT 22033 Introduction By S. Sabraz Nawaz Recommended Reading Database Management Systems 3 rd Edition, Ramakrishnan, Gehrke Murach s SQL Server 2008 for Developers Any book
More informationA data-driven framework for archiving and exploring social media data
A data-driven framework for archiving and exploring social media data Qunying Huang and Chen Xu Yongqi An, 20599957 Oct 18, 2016 Introduction Social media applications are widely deployed in various platforms
More informationDatabase Developers Forum APEX
Database Developers Forum APEX 20.05.2014 Antonio Romero Marin, Aurelien Fernandes, Jose Rolland Lopez De Coca, Nikolay Tsvetkov, Zereyakob Makonnen, Zory Zaharieva BE-CO Contents Introduction to the Controls
More informationArcGIS for Intelligence: Discern Activities of Interest Through Advanced Analysis. Natalie Feuerstein Ben Conklin Lyle Wright
ArcGIS for Intelligence: Discern Activities of Interest Through Advanced Analysis Natalie Feuerstein Ben Conklin Lyle Wright Challenges Demo Movement Pattern Dashboard Key Concepts New Analytic Workflow
More informationAnalyzing Big Data with Microsoft R
Analyzing Big Data with Microsoft R 20773; 3 days, Instructor-led Course Description The main purpose of the course is to give students the ability to use Microsoft R Server to create and run an analysis
More information