Preparing Digital Collections for Big Data Analysis. Sven Schlarb, Austrian Institute of Technology. e-archiving, Cordoba, Spain, 5th October 2018

1 Preparing Digital Collections for Big Data Analysis. Sven Schlarb, Austrian Institute of Technology. e-archiving, Cordoba, Spain, 5th October 2018

2 Digital Transformation Copyright Doc Searls,

3 Digital Transformation Copyright (network diagram) CC BY-SA 4.0

4 Archiving at internet scale

5 Is big data still a hype? 2014 [Figure labelled "BIG DATA". Image: Jeremykemp at English Wikipedia, GFDL or CC BY-SA 3.0, from Wikimedia Commons]

6 Is big data still a hype? 2015 [Image: Jeremykemp at English Wikipedia, GFDL or CC BY-SA 3.0, from Wikimedia Commons]

7 Is big data still a hype? 2018 [Figure labelled "BIG DATA". Image: Jeremykemp at English Wikipedia, GFDL or CC BY-SA 3.0, from Wikimedia Commons]

8 To SQL or to NoSQL? Relational databases; NoSQL databases

9 Different NoSQL database types

Key-Value:
  K1  AAA,BBB,CCC
  K2  AAA,BBB
  K3  AAA,DDD
  K4  AAA,2,01/01/2018
  K5  3,ZZZ,5623

Wide Column:
  Key | Participant     | Conference
  ID  | Name   City     | Name     Address     City
  1   | John   London   | PVC2018  Townroad 2  Manchester
  2   | Linda  Palme    | TFC2018  Market 2    Berlin

Document:
  { "name": "Sven Schlarb", " ": "sven.schlarbait.ac.at", "events": [ { "name": "Kulturhackathon openglam.at", "date": " T00:00:00.000Z" }, { "name": "e-archiving Cordoba", "date": " T00:00:00.000Z" } ] }

Graph (diagram): nodes labelled Event, Person, Person
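As a rough illustration of how the four models on this slide differ, the sketch below expresses each one with plain Python data structures. It is not tied to any particular database product. The grouping of the wide-column values into `participant` and `conference` families, the node identifiers, and the `ATTENDS` edge label are assumptions made for the example.

```python
# Miniature, in-memory stand-ins for the four NoSQL data models.

# Key-value: an opaque value looked up by its key.
key_value = {"K1": "AAA,BBB,CCC", "K2": "AAA,BBB"}

# Wide column: a row key maps to column families, each holding its own columns.
wide_column = {
    "1": {
        "participant": {"name": "John", "city": "London"},
        "conference": {"name": "PVC2018", "city": "Manchester"},
    },
}

# Document: a nested, self-describing record (a dict mirroring JSON).
document = {
    "name": "Sven Schlarb",
    "events": [{"name": "e-archiving Cordoba"}],
}

# Graph: typed nodes plus labelled edges between them (hypothetical ids).
nodes = {"p1": "Person", "p2": "Person", "e1": "Event"}
edges = [("p1", "ATTENDS", "e1"), ("p2", "ATTENDS", "e1")]


def attendees(event_id):
    """Return the ids of all Person nodes with an ATTENDS edge to the event."""
    return [
        src
        for src, label, dst in edges
        if label == "ATTENDS" and dst == event_id and nodes[src] == "Person"
    ]
```

The graph model is the only one where the relationship itself ("who attends what") is stored as data, which is why `attendees` can answer the question by walking edges rather than scanning records.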

10 E-ARK Experimental Cluster
Diagram labels: Task Trackers, Job Tracker
Name Node: CPU: 2 x 2.40GHz Quadcore CPU (16 HyperThreading cores); RAM: 24GB; DISK: 3 x 1TB disks configured as RAID5 (redundancy), 2 TB effective
Data Nodes: CPU: 1 x 2.53GHz Quadcore CPU (8 HyperThreading cores); RAM: 16GB; DISK: 2 x 1TB disks configured as RAID0 (performance), 2 TB effective
Of the 8 HT cores on each data node: 5 for Map; 2 for Reduce; 1 for OS.
In total: 25 processing cores for Map tasks; 10 processing cores for Reduce tasks
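The core allocation can be checked with a few lines of arithmetic. The sketch below assumes the 5/2/1 split applies to the 8 HyperThreading cores of a single data node. The number of data nodes is not stated on the slide, so the value computed here is an inference from the totals, not a documented figure.

```python
# Per-data-node allocation as given on the slide.
HT_CORES_PER_DATA_NODE = 8
MAP_CORES, REDUCE_CORES, OS_CORES = 5, 2, 1

# Cluster-wide totals as given on the slide.
TOTAL_MAP_CORES = 25
TOTAL_REDUCE_CORES = 10

# The per-node split accounts for every HT core of a data node.
assert MAP_CORES + REDUCE_CORES + OS_CORES == HT_CORES_PER_DATA_NODE


def implied_node_count(total_cores, cores_per_node):
    """Number of nodes needed to reach the total with a fixed per-node share."""
    count, remainder = divmod(total_cores, cores_per_node)
    if remainder:
        raise ValueError("total is not a multiple of the per-node allocation")
    return count


# Both totals should point to the same number of data nodes (an inference).
data_nodes = implied_node_count(TOTAL_MAP_CORES, MAP_CORES)
assert data_nodes == implied_node_count(TOTAL_REDUCE_CORES, REDUCE_CORES)
```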

11 Package transformation and Ingest Reference Implementation Full-text indexing & search Modular package transformation workflows & metadata creation Parallelize full-text indexing Access Fast random access to individual files Faceted Search & Data Mining Aggregating data using facet queries Data mining (Classification, NER)
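"Aggregating data using facet queries" means counting how many indexed records fall under each value of a field. The sketch below reproduces that aggregation in memory with the standard library. It is only an illustration of what a facet query returns, not the search engine's own implementation, and the records and field names are hypothetical.

```python
from collections import Counter

# Hypothetical index records; ids, field names and values are illustrative.
records = [
    {"id": "doc1", "format": "pdf", "year": 2016},
    {"id": "doc2", "format": "tiff", "year": 2016},
    {"id": "doc3", "format": "pdf", "year": 2017},
]


def facet_counts(records, field):
    """Count records per distinct value of `field`, skipping records without it."""
    return Counter(record[field] for record in records if field in record)


format_facet = facet_counts(records, "format")
year_facet = facet_counts(records, "year")
```

A search front end would typically show such counts next to each filter value, so users can see how many files of each format or year a collection holds before drilling down.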

12 E-ARK Information Package (simplified) SIP representations Metadata edits Migrations Add emulation info DIP metadata Structural metadata Provenance metadata Descriptive metadata Technical metadata SIP DIP [schemas/documentation]

13 earkweb earkweb is based on Python and the Celery task execution system. Archival workflows are created from predefined tasks, which can be executed in parallel on a computer cluster. Example tasks are data validation, format migration, content extraction, database transformation, packaging, and interfacing with storage systems. earkweb provides a graphical interface and can be used interactively as well as in batch mode.
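The idea of chaining predefined tasks into a workflow and running many packages in parallel can be sketched as follows. This is not earkweb code and does not use Celery: it swaps in the standard library's `concurrent.futures` thread pool so the example runs without a message broker. The task functions and the package dictionary layout are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def validate(package):
    """Hypothetical validation task: a package is valid if it has files."""
    return {**package, "valid": bool(package["files"])}


def extract_content(package):
    """Hypothetical extraction task: join file names as stand-in content."""
    return {**package, "text": " ".join(package["files"])}


# A workflow is an ordered list of predefined tasks.
WORKFLOW = [validate, extract_content]


def run_workflow(package):
    """Apply each task in turn, passing the result on to the next task."""
    for task in WORKFLOW:
        package = task(package)
    return package


def run_batch(packages, workers=4):
    """Process several packages in parallel; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_workflow, packages))


results = run_batch([
    {"id": "a", "files": ["x.txt", "y.txt"]},
    {"id": "b", "files": []},
])
```

In a Celery-based setup the workers would run on separate cluster machines and receive tasks through a broker, but the structure (small tasks, an ordered chain, many packages in flight at once) is the same.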

14 Cluster Deployment Stack Information package status Task results <<search and retrieval>> decoupled <<notification>> Worker Worker Worker Worker Staging/Storage Area NAS <<package transfer>> 6/30/16

15 Standalone Deployment Stack Information package status Task results <<search and retrieval>> Worker Worker Worker Worker Staging/Storage Area NAS <<indexing>> 6/30/16

16 Data Mining/NLP Purpose: Analyse digital resources of collections Selected use cases: Location names occurring in texts. Named entity recognition and incorporation of geoinformation Text classification

17 Location names occurring in texts: Stanford NER for NER; Nominatim (database behind openstreetmap.org) for georeferencing; Peripleo for visualization
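The hand-over from entity recognition to georeferencing can be sketched as below. The code assumes a NER step has already produced `(text, label)` pairs, with `LOCATION` as the label for place names. It only builds Nominatim search URLs and sends no request. The entity list is invented for illustration, and how the original pipeline called Nominatim is not described on the slide.

```python
from urllib.parse import urlencode

# Public Nominatim search endpoint; q, format and limit are query parameters.
NOMINATIM_SEARCH = "https://nominatim.openstreetmap.org/search"


def geocode_url(place_name, limit=1):
    """Build a Nominatim search URL for one place name (no request is sent)."""
    params = {"q": place_name, "format": "json", "limit": limit}
    return f"{NOMINATIM_SEARCH}?{urlencode(params)}"


def georeference_requests(entities):
    """Keep only location entities and map each one to its lookup URL."""
    return {
        text: geocode_url(text)
        for text, label in entities
        if label == "LOCATION"
    }


# Hypothetical NER output for one sentence.
entities = [("Sven Schlarb", "PERSON"), ("Cordoba", "LOCATION")]
requests_by_place = georeference_requests(entities)
```

When such URLs are actually fetched, the public Nominatim service expects requests to be rate limited and identified, so a batch job over a whole collection would usually cache results per place name.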

18 Location names occurring in texts Peripleo - PELAGIOS Project

19 Geographical/timeline search Provided: GML data and TIFF images of maps with metadata (coordinate system, time, etc.). Convert GML data to Peripleo RDF; translate coordinate system if necessary; use Peripleo to search for and visualize regions and filter by time. Peripleo - PELAGIOS Project

20 Geographical/timeline search Peripleo - PELAGIOS Project

21 Text classification using scikit-learn: Prepare data to train SVM classifier; dump full-texts of the repository into reusable packages; apply text classification and update Solr records accordingly
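A minimal sketch of these steps, assuming scikit-learn is installed: TF-IDF features feed a linear SVM, and the predicted label is wrapped in a Solr atomic-update document. The training texts, the labels, the document id and the `category` field name are all invented for illustration. The slide does not say which features, kernel or Solr schema were used, and the update payload is only constructed here, not sent.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Tiny hypothetical training set standing in for dumped repository full-texts.
train_texts = [
    "invoice payment amount due tax",
    "payment received invoice total",
    "tax invoice amount total",
    "meeting agenda minutes attendees",
    "minutes of the board meeting",
    "agenda for project meeting attendees",
]
train_labels = ["finance", "finance", "finance", "meeting", "meeting", "meeting"]

# TF-IDF vectorisation followed by a linear-kernel SVM classifier.
classifier = make_pipeline(TfidfVectorizer(), SVC(kernel="linear"))
classifier.fit(train_texts, train_labels)


def solr_atomic_update(doc_id, label, field="category"):
    """Build a Solr atomic-update document that sets one field on a record."""
    return {"id": doc_id, field: {"set": label}}


# Classify an unseen text and prepare the corresponding index update.
predicted = str(classifier.predict(["invoice total amount"])[0])
update = solr_atomic_update("doc42", predicted)
```

Posting a list of such update documents to the index would change only the named field and leave the rest of each record untouched.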

22 Database archiving, rebuilding and analysis e.g. Postgres SIARD e.g. Oracle RDBMS data (up to 80TB) Submit... Archive... Reconstruct... Analyse. source: wikipedia

23 Thank you very much for your attention! Any questions?

Large Scale Processing with Hadoop

Large Scale Processing with Hadoop Large Scale Processing with Hadoop William Palmer Some slides courtesy of Per Møldrup-Dalum (State and University Library, Denmark) and Sven Schlarb (Austrian National Library) SCAPE Information Day British

More information

Real Time for Big Data: The Next Age of Data Management. Talksum, Inc. Talksum, Inc. 582 Market Street, Suite 1902, San Francisco, CA 94104

Real Time for Big Data: The Next Age of Data Management. Talksum, Inc. Talksum, Inc. 582 Market Street, Suite 1902, San Francisco, CA 94104 Real Time for Big Data: The Next Age of Data Management Talksum, Inc. Talksum, Inc. 582 Market Street, Suite 1902, San Francisco, CA 94104 Real Time for Big Data The Next Age of Data Management Introduction

More information

System Requirements EDT 6.0. discoveredt.com

System Requirements EDT 6.0. discoveredt.com System Requirements EDT 6.0 discoveredt.com Contents Introduction... 3 1 Components, Modules & Data Repositories... 3 2 Infrastructure Options... 5 2.1 Scenario 1 - EDT Portable or Server... 5 2.2 Scenario

More information

Analytics Platform for ATLAS Computing Services

Analytics Platform for ATLAS Computing Services Analytics Platform for ATLAS Computing Services Ilija Vukotic for the ATLAS collaboration ICHEP 2016, Chicago, USA Getting the most from distributed resources What we want To understand the system To understand

More information

SYSTEM REQUIREMENTS M.APP ENTERPRISE

SYSTEM REQUIREMENTS M.APP ENTERPRISE SYSTEM REQUIREMENTS M.APP ENTERPRISE Description or Document Category October 06, 2016 Contents M.App Enterprise Server... 3 Hardware requirements... 3 Disk space requirements... 3 Production environment

More information

Bring Context To Your Machine Data With Hadoop, RDBMS & Splunk

Bring Context To Your Machine Data With Hadoop, RDBMS & Splunk Bring Context To Your Machine Data With Hadoop, RDBMS & Splunk Raanan Dagan and Rohit Pujari September 25, 2017 Washington, DC Forward-Looking Statements During the course of this presentation, we may

More information

Performance Baselines and Recommendations. September 7, 2018 Version 9.4

Performance Baselines and Recommendations. September 7, 2018 Version 9.4 Performance Baselines and Recommendations September 7, 2018 Version 9.4 For the most recent version of this document, visit our documentation website. Table of Contents 1 Performance baselines and recommendations

More information

OKKAM-based instance level integration

OKKAM-based instance level integration OKKAM-based instance level integration Paolo Bouquet W3C RDF2RDB This work is co-funded by the European Commission in the context of the Large-scale Integrated project OKKAM (GA 215032) RoadMap Using the

More information

Autopsy as a Service Distributed Forensic Compute That Combines Evidence Acquisition and Analysis

Autopsy as a Service Distributed Forensic Compute That Combines Evidence Acquisition and Analysis Autopsy as a Service Distributed Forensic Compute That Combines Evidence Acquisition and Analysis Presentation to OSDFCon 2016 Dan Gonzales, Zev Winkelman, John Hollywood, Dulani Woods, Ricardo Sanchez,

More information

Jure Leskovec Including joint work with Y. Perez, R. Sosič, A. Banarjee, M. Raison, R. Puttagunta, P. Shah

Jure Leskovec Including joint work with Y. Perez, R. Sosič, A. Banarjee, M. Raison, R. Puttagunta, P. Shah Jure Leskovec (@jure) Including joint work with Y. Perez, R. Sosič, A. Banarjee, M. Raison, R. Puttagunta, P. Shah 2 My research group at Stanford: Mining and modeling large social and information networks

More information

The Hadoop Ecosystem. EECS 4415 Big Data Systems. Tilemachos Pechlivanoglou

The Hadoop Ecosystem. EECS 4415 Big Data Systems. Tilemachos Pechlivanoglou The Hadoop Ecosystem EECS 4415 Big Data Systems Tilemachos Pechlivanoglou tipech@eecs.yorku.ca A lot of tools designed to work with Hadoop 2 HDFS, MapReduce Hadoop Distributed File System Core Hadoop component

More information

vsphere Update Manager Installation and Administration Guide 17 APR 2018 VMware vsphere 6.7 vsphere Update Manager 6.7

vsphere Update Manager Installation and Administration Guide 17 APR 2018 VMware vsphere 6.7 vsphere Update Manager 6.7 vsphere Update Manager Installation and Administration Guide 17 APR 2018 VMware vsphere 6.7 vsphere Update Manager 6.7 You can find the most up-to-date technical documentation on the VMware website at:

More information

The OAIS Reference Model: current implementations

The OAIS Reference Model: current implementations The OAIS Reference Model: current implementations Michael Day, UKOLN, University of Bath m.day@ukoln.ac.uk Chinese-European Workshop on Digital Preservation, Beijing, China, 14-16 July 2004 Presentation

More information

Sun Lustre Storage System Simplifying and Accelerating Lustre Deployments

Sun Lustre Storage System Simplifying and Accelerating Lustre Deployments Sun Lustre Storage System Simplifying and Accelerating Lustre Deployments Torben Kling-Petersen, PhD Presenter s Name Principle Field Title andengineer Division HPC &Cloud LoB SunComputing Microsystems

More information

Can Enterprise Storage Fix Hadoop? PRESENTATION TITLE GOES HERE John Webster Senior Partner Evaluator Group

Can Enterprise Storage Fix Hadoop? PRESENTATION TITLE GOES HERE John Webster Senior Partner Evaluator Group Can Enterprise Storage Fix Hadoop? PRESENTATIN TITLE GES HERE John Webster Senior Partner Evaluator Group Agenda What is the Internet Data Center and how is it different from Enterprise Data Center? How

More information

REACH-IT Stakeholder Workshop. REACH-IT Architecture

REACH-IT Stakeholder Workshop. REACH-IT Architecture REACH-IT Stakeholder Workshop REACH-IT Architecture Aims of the presentation Introduce to the architecture of the REACH-IT application from different, complementary angles Functional [ Use Case and Logical

More information

THE ATLAS DISTRIBUTED DATA MANAGEMENT SYSTEM & DATABASES

THE ATLAS DISTRIBUTED DATA MANAGEMENT SYSTEM & DATABASES 1 THE ATLAS DISTRIBUTED DATA MANAGEMENT SYSTEM & DATABASES Vincent Garonne, Mario Lassnig, Martin Barisits, Thomas Beermann, Ralph Vigne, Cedric Serfon Vincent.Garonne@cern.ch ph-adp-ddm-lab@cern.ch XLDB

More information

National Documentation Centre Open access in Cultural Heritage digital content

National Documentation Centre Open access in Cultural Heritage digital content National Documentation Centre Open access in Cultural Heritage digital content Haris Georgiadis, Ph.D. Senior Software Engineer EKT hgeorgiadis@ekt.gr The beginning.. 42 institutions documented & digitalized

More information

Ivane Javakhishvili Tbilisi State University High Energy Physics Institute HEPI TSU

Ivane Javakhishvili Tbilisi State University High Energy Physics Institute HEPI TSU Ivane Javakhishvili Tbilisi State University High Energy Physics Institute HEPI TSU Grid cluster at the Institute of High Energy Physics of TSU Authors: Arnold Shakhbatyan Prof. Zurab Modebadze Co-authors:

More information

Socrates: A System for Scalable Graph Analytics C. Savkli, R. Carr, M. Chapman, B. Chee, D. Minch

Socrates: A System for Scalable Graph Analytics C. Savkli, R. Carr, M. Chapman, B. Chee, D. Minch Socrates: A System for Scalable Graph Analytics C. Savkli, R. Carr, M. Chapman, B. Chee, D. Minch September 10, 2014 Cetin Savkli Cetin.Savkli@jhuapl.edu 240 228 0115 Challenges of Big Data & Analytics

More information

What's New In Informatica Data Quality 9.0.1

What's New In Informatica Data Quality 9.0.1 What's New In Informatica Data Quality 9.0.1 2010 Abstract When you upgrade Informatica Data Quality to version 9.0.1, you will find multiple new features and enhancements. The new features include a new

More information

Table 1 The Elastic Stack use cases Use case Industry or vertical market Operational log analytics: Gain real-time operational insight, reduce Mean Ti

Table 1 The Elastic Stack use cases Use case Industry or vertical market Operational log analytics: Gain real-time operational insight, reduce Mean Ti Solution Overview Cisco UCS Integrated Infrastructure for Big Data with the Elastic Stack Cisco and Elastic deliver a powerful, scalable, and programmable IT operations and security analytics platform

More information

Streamlining CASTOR to manage the LHC data torrent

Streamlining CASTOR to manage the LHC data torrent Streamlining CASTOR to manage the LHC data torrent G. Lo Presti, X. Espinal Curull, E. Cano, B. Fiorini, A. Ieri, S. Murray, S. Ponce and E. Sindrilaru CERN, 1211 Geneva 23, Switzerland E-mail: giuseppe.lopresti@cern.ch

More information

A never-ending database migration

A never-ending database migration A never-ending database migration Charles Delort IT-DB November 20, 2017 Table of Contents Years ago, decisions were made A few years later PostgreSQL Foreign Data Wrappers First step of Migration Apiato

More information

Modern Data Warehouse The New Approach to Azure BI

Modern Data Warehouse The New Approach to Azure BI Modern Data Warehouse The New Approach to Azure BI History On-Premise SQL Server Big Data Solutions Technical Barriers Modern Analytics Platform On-Premise SQL Server Big Data Solutions Modern Analytics

More information

MixApart: Decoupled Analytics for Shared Storage Systems. Madalin Mihailescu, Gokul Soundararajan, Cristiana Amza University of Toronto and NetApp

MixApart: Decoupled Analytics for Shared Storage Systems. Madalin Mihailescu, Gokul Soundararajan, Cristiana Amza University of Toronto and NetApp MixApart: Decoupled Analytics for Shared Storage Systems Madalin Mihailescu, Gokul Soundararajan, Cristiana Amza University of Toronto and NetApp Hadoop Pig, Hive Hadoop + Enterprise storage?! Shared storage

More information

The webinar will start soon... Elasticsearch Performance Optimisation

The webinar will start soon... Elasticsearch Performance Optimisation The webinar will start soon... Performance Optimisation 1 whoami Alan Hardy Sr. Solutions Architect NEMEA 2 Webinar Housekeeping & Logistics Slides and recording will be available following the webinar

More information

Accelerating Enterprise Search with Fusion iomemory PCIe Application Accelerators

Accelerating Enterprise Search with Fusion iomemory PCIe Application Accelerators WHITE PAPER Accelerating Enterprise Search with Fusion iomemory PCIe Application Accelerators Western Digital Technologies, Inc. 951 SanDisk Drive, Milpitas, CA 95035 www.sandisk.com Table of Contents

More information

VCP410 VMware vsphere Cue Cards

VCP410 VMware vsphere Cue Cards VMware ESX 4.0 will only install and run on servers with 64-bit x86 CPUs. ESX 4.0 Requires 2GB RAM minimum ESX 4.0 requires 1 or more network adapters ESX 4.0 requires a SCSI disk, Fibre Channel LUN, or

More information

Migrating from FAST to EMC Documentum xplore: What To Do and Why You'll Love It. Ed Bueché EMC Distinguished Engineer and xplore Architect

Migrating from FAST to EMC Documentum xplore: What To Do and Why You'll Love It. Ed Bueché EMC Distinguished Engineer and xplore Architect Migrating from FAST to EMC Documentum xplore: What To Do and Why You'll Love It Ed Bueché EMC Distinguished Engineer and xplore Architect Agenda Introduction to xplore xplore 1.2 new capabilities FAST-to-xPlore

More information

Data Capture Recommended Operating Environments

Data Capture Recommended Operating Environments Oracle Insurance Data Capture Recommended Operating Environments Release 5.2 October 2014 CONTENTS STATEMENT OF PURPOSE... 3 OIDC Hardware Configuration Example... 4 OIDC Workflow Example... 5 QUICK VIEW...

More information

Elasticsearch & ATLAS Data Management. European Organization for Nuclear Research (CERN)

Elasticsearch & ATLAS Data Management. European Organization for Nuclear Research (CERN) Elasticsearch & ATAS Data Management European Organization for Nuclear Research (CERN) ralph.vigne@cern.ch mario.lassnig@cern.ch ATAS Analytics Platform proposed eb. 2015; work in progress; correlate data

More information

IBM Scale Out Network Attached Storage (SONAS) using the Acuo Universal Clinical Platform

IBM Scale Out Network Attached Storage (SONAS) using the Acuo Universal Clinical Platform IBM Scale Out Network Attached Storage (SONAS) using the Acuo Universal Clinical Platform A vendor-neutral medical-archive offering Dave Curzio IBM Systems and Technology Group ISV Enablement February

More information

General Model of E-ARK Services

General Model of E-ARK Services General Model of E-ARK Services DLM Forum Members Meeting 10-11 June 2014, Athens Istvan Alföldi National Archives of Hungary Agenda E-ARK General Model Conceptual framework Used methodology Results (not

More information

High Performance Computing on MapReduce Programming Framework

High Performance Computing on MapReduce Programming Framework International Journal of Private Cloud Computing Environment and Management Vol. 2, No. 1, (2015), pp. 27-32 http://dx.doi.org/10.21742/ijpccem.2015.2.1.04 High Performance Computing on MapReduce Programming

More information

Meridian. Technical Specifications

Meridian. Technical Specifications Meridian Technical Specifications Debt Management Unit Commonwealth Secretariat 2017 Commonwealth Secretariat, all rights reserved Copyright of the whole and any part of this document is owned by the Commonwealth

More information

LABEL ARCHIVE Administrator s Guide

LABEL ARCHIVE Administrator s Guide LABEL ARCHIVE Administrator s Guide DOC-LAS2015_25/05/2015 The information in this manual is not binding and may be modified without prior notice. Supply of the software described in this manual is subject

More information

A Web Service for Scholarly Big Data Information Extraction

A Web Service for Scholarly Big Data Information Extraction A Web Service for Scholarly Big Data Information Extraction Kyle Williams, Lichi Li, Madian Khabsa, Jian Wu, Patrick C. Shih and C. Lee Giles Information Sciences and Technology Computer Science and Engineering

More information

Topics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples

Topics. Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples Hadoop Introduction 1 Topics Big Data Analytics What is and Why Hadoop? Comparison to other technologies Hadoop architecture Hadoop ecosystem Hadoop usage examples 2 Big Data Analytics What is Big Data?

More information

Resource and Performance Distribution Prediction for Large Scale Analytics Queries

Resource and Performance Distribution Prediction for Large Scale Analytics Queries Resource and Performance Distribution Prediction for Large Scale Analytics Queries Prof. Rajiv Ranjan, SMIEEE School of Computing Science, Newcastle University, UK Visiting Scientist, Data61, CSIRO, Australia

More information

Hadoop 2.x Core: YARN, Tez, and Spark. Hortonworks Inc All Rights Reserved

Hadoop 2.x Core: YARN, Tez, and Spark. Hortonworks Inc All Rights Reserved Hadoop 2.x Core: YARN, Tez, and Spark YARN Hadoop Machine Types top-of-rack switches core switch client machines have client-side software used to access a cluster to process data master nodes run Hadoop

More information

SharePoint 2010 Technical Case Study: Microsoft SharePoint Server 2010 Enterprise Intranet Collaboration Environment

SharePoint 2010 Technical Case Study: Microsoft SharePoint Server 2010 Enterprise Intranet Collaboration Environment SharePoint 2010 Technical Case Study: Microsoft SharePoint Server 2010 Enterprise Intranet Collaboration Environment This document is provided as-is. Information and views expressed in this document, including

More information

Cloud Computing & Visualization

Cloud Computing & Visualization Cloud Computing & Visualization Workflows Distributed Computation with Spark Data Warehousing with Redshift Visualization with Tableau #FIUSCIS School of Computing & Information Sciences, Florida International

More information

FuxiSort. Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc

FuxiSort. Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc Fuxi Jiamang Wang, Yongjun Wu, Hua Cai, Zhipeng Tang, Zhiqiang Lv, Bin Lu, Yangyu Tao, Chao Li, Jingren Zhou, Hong Tang Alibaba Group Inc {jiamang.wang, yongjun.wyj, hua.caihua, zhipeng.tzp, zhiqiang.lv,

More information

Austrian Statistical Datawarehouse (sdwh)

Austrian Statistical Datawarehouse (sdwh) Eliane Schwerer Registers, Classifications and Geoinformation Geneva 11 th 13 th April 2018 Austrian Statistical Datawarehouse (sdwh) an application of the GSIM model www.statistik.at We provide information

More information

Big Data Facebook

Big Data Facebook Big Data Architectures@ Facebook QCon London 2012 Ashish Thusoo Outline Big Data @ Facebook - Scope & Scale Evolution of Big Data Architectures @ FB Past, Present and Future Questions Big Data @ FB: Scale

More information

Cisco Unified Provisioning Manager 2.2

Cisco Unified Provisioning Manager 2.2 Cisco Unified Provisioning Manager 2.2 General Q. What is Cisco Unified Provisioning Manager (UPM)? A. Cisco Unified Provisioning Manager is part of the Cisco Unified Communications Management Suite. Cisco

More information

Extending the Scope of Custom Transformations

Extending the Scope of Custom Transformations Paper 3306-2015 Extending the Scope of Custom Transformations Emre G. SARICICEK, The University of North Carolina at Chapel Hill. ABSTRACT Building and maintaining a data warehouse can require complex

More information

International Journal of Advance Engineering and Research Development. A Study: Hadoop Framework

International Journal of Advance Engineering and Research Development. A Study: Hadoop Framework Scientific Journal of Impact Factor (SJIF): e-issn (O): 2348- International Journal of Advance Engineering and Research Development Volume 3, Issue 2, February -2016 A Study: Hadoop Framework Devateja

More information

Cisco and Cloudera Deliver WorldClass Solutions for Powering the Enterprise Data Hub alerts, etc. Organizations need the right technology and infrastr

Cisco and Cloudera Deliver WorldClass Solutions for Powering the Enterprise Data Hub alerts, etc. Organizations need the right technology and infrastr Solution Overview Cisco UCS Integrated Infrastructure for Big Data and Analytics with Cloudera Enterprise Bring faster performance and scalability for big data analytics. Highlights Proven platform for

More information

FLORIDA DEPARTMENT OF TRANSPORTATION PRODUCTION BIG DATA PLATFORM

FLORIDA DEPARTMENT OF TRANSPORTATION PRODUCTION BIG DATA PLATFORM FLORIDA DEPARTMENT OF TRANSPORTATION PRODUCTION BIG DATA PLATFORM RECOMMENDATION AND JUSTIFACTION Executive Summary: VHB has been tasked by the Florida Department of Transportation District Five to design

More information

Ingo Brenckmann Jochen Kirsten Storage Technology Strategists SAS EMEA Copyright 2003, SAS Institute Inc. All rights reserved.

Ingo Brenckmann Jochen Kirsten Storage Technology Strategists SAS EMEA Copyright 2003, SAS Institute Inc. All rights reserved. Intelligent Storage Results from real life testing Ingo Brenckmann Jochen Kirsten Storage Technology Strategists SAS EMEA SAS Intelligent Storage components! OLAP Server! Scalable Performance Data Server!

More information

THIS ADDENDUM IS TO PROVIDE CLARIFICATION AND ANSWERS TO QUESTIONS THAT WERE SUBMITTED FOR THIS REP. QUESTIONS AND ANSWERS ARE ATTACHED.

THIS ADDENDUM IS TO PROVIDE CLARIFICATION AND ANSWERS TO QUESTIONS THAT WERE SUBMITTED FOR THIS REP. QUESTIONS AND ANSWERS ARE ATTACHED. TULSA COUNTY PURCHASING DEPARTMENT MEMO DATE: FROM: TO: SUBJECT: DECEMBER 15, 2015 LINDA R. DORRELL PURCHASING DIRECTOR BOARD OF COUNTY COMMI~ ADDENDUM #1- RFP- BACKUP AND RESTORAL SOLUTION ON NOVEMBER

More information

Phire 12.2 Hardware and Software Requirements

Phire 12.2 Hardware and Software Requirements Phire 12.2 Hardware and Software Requirements Copyright 2017, Phire. All rights reserved. The Programs (which include both the software and documentation) contain proprietary information; they are provided

More information

System Requirements. Hardware and Virtual Appliance Requirements

System Requirements. Hardware and Virtual Appliance Requirements This chapter provides a link to the Cisco Secure Network Server Data Sheet and lists the virtual appliance requirements. Hardware and Virtual Appliance Requirements, page 1 Virtual Machine Appliance Size

More information

Microsoft SQL Server 2012 Fast Track Reference Architecture Using PowerEdge R720 and Compellent SC8000

Microsoft SQL Server 2012 Fast Track Reference Architecture Using PowerEdge R720 and Compellent SC8000 Microsoft SQL Server 2012 Fast Track Reference Architecture Using PowerEdge R720 and Compellent SC8000 This whitepaper describes the Dell Microsoft SQL Server Fast Track reference architecture configuration

More information

A Fast and High Throughput SQL Query System for Big Data

A Fast and High Throughput SQL Query System for Big Data A Fast and High Throughput SQL Query System for Big Data Feng Zhu, Jie Liu, and Lijie Xu Technology Center of Software Engineering, Institute of Software, Chinese Academy of Sciences, Beijing, China 100190

More information

Pinnacle 3 SmartEnterprise

Pinnacle 3 SmartEnterprise Pinnacle 3 SmartEnterprise Pinnacle 3 SmartEnterprise centralized computing platform X6-2 specifications sheet Scalable capacity and robust healthcare IT integration for high volume clinics Built for high

More information

CC PROCESAMIENTO MASIVO DE DATOS OTOÑO 2018

CC PROCESAMIENTO MASIVO DE DATOS OTOÑO 2018 CC5212-1 PROCESAMIENTO MASIVO DE DATOS OTOÑO 2018 Lecture 1: Introduction Aidan Hogan aidhog@gmail.com THE VALUE OF DATA Soho, London, 1854 Cholera: What we know now Cholera: What we knew in 1854 1854:

More information

Oracle NoSQL Database and Cisco- Collaboration that produces results. 1 Copyright 2011, Oracle and/or its affiliates. All rights reserved.

Oracle NoSQL Database and Cisco- Collaboration that produces results. 1 Copyright 2011, Oracle and/or its affiliates. All rights reserved. Oracle NoSQL Database and Cisco- Collaboration that produces results 1 Copyright 2011, Oracle and/or its affiliates. All rights reserved. What is Big Data? SOCIAL BLOG SMART METER VOLUME VELOCITY VARIETY

More information

vsphere Installation and Setup Update 2 Modified on 10 JULY 2018 VMware vsphere 6.5 VMware ESXi 6.5 vcenter Server 6.5

vsphere Installation and Setup Update 2 Modified on 10 JULY 2018 VMware vsphere 6.5 VMware ESXi 6.5 vcenter Server 6.5 vsphere Installation and Setup Update 2 Modified on 10 JULY 2018 VMware vsphere 6.5 VMware ESXi 6.5 vcenter Server 6.5 You can find the most up-to-date technical documentation on the VMware website at:

More information

White Paper. The Architecture and Security of SAS Marketing Operations Management

White Paper. The Architecture and Security of SAS Marketing Operations Management White Paper The Architecture and Security of SAS Marketing Operations Management Contents Introduction... 1 High-Level Architecture Overview... 1 SAS Marketing Operations Management Foundation... 3 Marketing

More information

Session Two: OAIS Model & Digital Curation Lifecycle Model

Session Two: OAIS Model & Digital Curation Lifecycle Model From the SelectedWorks of Group 4 SundbergVernonDhaliwal Winter January 19, 2016 Session Two: OAIS Model & Digital Curation Lifecycle Model Dr. Eun G Park Available at: https://works.bepress.com/group4-sundbergvernondhaliwal/10/

More information

HPSS RAIT. A high performance, resilient, fault-tolerant tape data storage class. 1

HPSS RAIT. A high performance, resilient, fault-tolerant tape data storage class.   1 HPSS RAIT A high performance, resilient, fault-tolerant tape data storage class http://www.hpss-collaboration.org 1 Why RAIT? HPSS supports striped tape without RAIT o Conceptually similar to RAID 0 o

More information

Oracle Hospitality Materials Control. Server Sizing Guide

Oracle Hospitality Materials Control. Server Sizing Guide Oracle Hospitality Materials Control Server Sizing Guide Release 18.1 E96487-04 April 2019 Oracle Hospitality Materials Control Server Sizing Guide, Release 18.1 E96487-04 Copyright 1998, 2019, Oracle

More information

HDFS: Hadoop Distributed File System. CIS 612 Sunnie Chung

HDFS: Hadoop Distributed File System. CIS 612 Sunnie Chung HDFS: Hadoop Distributed File System CIS 612 Sunnie Chung What is Big Data?? Bulk Amount Unstructured Introduction Lots of Applications which need to handle huge amount of data (in terms of 500+ TB per

More information

Applied Interoperability in Digital Preservation: Solutions from the E-ARK Project

Applied Interoperability in Digital Preservation: Solutions from the E-ARK Project Applied Interoperability in Digital Preservation: Solutions from the E-ARK Project Kuldar Aas National Archives of Estonia J. Liivi 4 Tartu, 50409, Estonia +372 7387 543 Kuldar.Aas@ra.ee Andrew Wilson

More information

Mixing and matching virtual and physical HPC clusters. Paolo Anedda
