Innovation at AWS. Eric Ferreira Principal Database Engineer Amazon Redshift
1 Innovation at AWS Eric Ferreira Principal Database Engineer Amazon Redshift
2 The Amazon Flywheel Focus on things that stay the same Price Selection Delivery
3 Applying this at AWS
4 Focus on things that stay the same Performance Amazon Redshift Value Simplicity
5 Adopt a retail mindset
6 Customers have choice Delight them and they'll stay Earn their business one hour at a time
7 Start with the Customer Work Backwards
8 What Do Customers Want? What problems are customers facing? How will my service alleviate this pain? Why will this idea delight customers? Why can I do this better than anyone else?
9 What we heard from customers about DW Complicated to install, maintain, operate Require large upfront payments Too expensive Always running out of capacity
10 Press Release Describe the product in terms of customer value Why will customers care? Is it newsworthy? How is this differentiated?
11 FAQ Answer customer questions How does this help me? How do I get started? How will this work with my ETL/BI tools? When should I use this vs. Hadoop?
12 2 pizza teams An individual team should be no larger than can be fed by two pizzas. Beyond this size, you define contracts and interfaces with other teams Attention is a scarce resource. Time is a scarce resource Apply attention and time to changing reality, not communicating status.
13 Build the Product Assemble a Team Build Internal Beta Private Beta Launch Iterate
14 Iterate
15 Get Feedback Add Features that matter Increase Adoption Raise Value
16 Redshift pushes a new DB version every two weeks features since launch Temp Credentials (4/11) Unload logs (7/5) Sharing snapshots (7/18) DUB (4/25) SOC1/2/3 (5/8) JDBC Fetch Size (6/27) Service Launch (2/14) Resource Level IAM (8/9) SHA1 Builtin (7/15) Statement Timeout (7/22) WLM Timeout/Wildcards (8/1) UTF-8 Substitution (8/29) Kinesis EMR/HDFS/SSH copy, Distributed Tables, Audit Logging/CloudTrail, Concurrency, Resize Perf., Approximate Count Distinct, SNS Alerts, Cross Region Backup (11/13) 3 new regex features, Unload to single file, FedRAMP(5/6) Resize progress indicator & Cluster Version (3/21) New query monitoring system tables and Split_part, Audit tables (10/3) diststyle all (1/13) 50 slots, COPY from EMR, ECDHE EIP Support for VPC Clusters (12/28) ciphers (4/22) PDX (4/2) NRT (6/5) Unload Encrypted Files PCI (8/22) CRC32 Builtin, CSV, Restore Progress HSM Support (11/11) (8/9) Timezone, Epoch, Autoformat (7/25) 4 byte UTF-8 (7/18) SIN/SYD (10/8) JSON, Regex, Cursors (9/10) Redshift on DW2 (SSD) Nodes (1/23) Distributed Tables, Single Node Cursor Support, Maximum Connections to 500 (12/13) Regex_Substr, COPY from JSON (3/25) Compression for COPY from SSH, Fetch size support for single node clusters, new system tables with commit stats, row_number(), strotol() and query termination (2/13)
17 Collect Store Analyze EMR Athena Direct Connect AWS Import/ Export Snowball S3 Glacier Redshift Machine Learning Kinesis AWS IoT DynamoDB Elasticsearch EC2 QuickSight Lambda AWS Database Migration Service AWS Glue
18 Collection & Storage Store anything Object storage Designed for % durability Amazon S3 Scalable & Cost effective; $0.023/GB-Mo Integrated with Amazon Glacier Support for multiple encryption methods; integrated with AWS KMS, with support for external HSMs
19 Data Management & ETL Hive Metastore-compatible data catalog with integrated crawlers for schema, data type, and partition inference Generates Python code to move data from source to destination AWS Glue Edit jobs using your favorite IDE and share snippets via Git Runs jobs in Spark containers that auto-scale based on SLA Serverless with no infrastructure to manage; pay only for the resources you consume
20 Amazon RDS for Aurora MySQL compatible with up to 5x better performance on the same hardware: 100,000 writes/sec & 500,000 reads/sec Scalable with up to 64 TB in single database, up to 15 read replicas Highly available, durable, and fault-tolerant custom SSD storage layer: 6-way replicated across 3 Availability Zones Transparent encryption for data at rest using AWS KMS Stored procedures in Aurora can invoke AWS Lambda functions MySQL & PostgreSQL compatible engines
21 Structured Data Processing Petabyte-scale relational, MPP, data warehousing clusters with the ability to join across Exabytes of data in S3 using Redshift Spectrum, a serverless scale out query layer that charges $5/TB scanned Fully managed with SSD and HDD platforms Built-in end to end security, including customer-managed keys Fault tolerant. Automatically recovers from disk and node failures Amazon Redshift Data automatically backed up to Amazon S3 with cross region backup capability for global disaster recovery $1,000/TB/Year; start at $0.25/hour. Provision in minutes; scale from 160GB to 2PB of compressed data with just a few clicks
22 Semi-structured / Unstructured Data Processing Hadoop, Hive, Presto, Spark, Tez, Impala etc. Release 5.3: Hadoop 2.7.3, Hive 2.1, Spark 2.1, Zeppelin, Presto, HBase and HBase on S3, Phoenix, Tez, Flink. New applications added within 30 days of their open source release Fully managed, autoscaling clusters with support for on-demand and spot pricing Amazon EMR Support for HDFS and S3 filesystems enabling separated compute and storage; multiple clusters can run against the same data in S3 HIPAA-eligible. Support for end-to-end encryption, IAM/VPC, S3 client-side encryption with customer managed keys and AWS KMS
23 Serverless Query Processing Serverless query service for querying data in S3 using standard SQL, with no infrastructure to manage No data loading required; query directly from Amazon S3 Amazon Athena Use standard ANSI SQL queries with support for joins, JSON, and window functions Support for multiple data formats, including text, CSV, TSV, JSON, Avro, ORC, Parquet Pay per query, only when you're running queries, based on data scanned. If you compress your data, you pay less and your queries run faster
24 Serverless Event Processing Server-less compute service that runs your code in response to events Extend AWS services with user defined custom logic AWS Lambda Write custom code in Node.js, Python, and Java Pay only for the requests served and compute time required - billing in increments of 100 milliseconds
25 Stream Processing Real-time stream processing High throughput; elastic Highly available; data replicated across multiple Availability Zones with configurable retention Amazon Kinesis S3, Redshift, DynamoDB Integrations Kinesis Streams for custom streaming applications; Kinesis Firehose for easy integration with Amazon S3 and Redshift; Kinesis Analytics for streaming SQL
26 Search and Operational Analytics Distributed search and analytics engine Managed service using Elasticsearch and Kibana Fully managed; Zero admin Amazon Elasticsearch Service Highly Available and Reliable Tightly integrated with other AWS services
27 Predictive Applications Easy to use, managed service built for developers - Deploy models to in seconds Robust, powerful technology based on Amazon's internal systems Amazon ML Create models using your data already stored in the AWS cloud; deploy models in batch and real time modes Spark on Amazon EMR also available for custom machine learning applications
28 Business Intelligence Fast and cloud-powered Easy to use, no infrastructure to manage Amazon QuickSight Scales to 100s of thousands of users Quick calculations with SPICE 1/10th the cost of legacy BI software
29 Amazon Redshift
30 Amazon SWF Amazon VPC AWS IAM Amazon EC2 OLAP MPP Columnar PostgreSQL Amazon Redshift Amazon S3 AWS KMS Amazon Route 53 Amazon CloudWatch
31 Redshift Cluster Architecture Massively parallel, shared nothing Leader node: SQL endpoint; stores metadata; coordinates parallel SQL processing Compute nodes: local, columnar storage; execute queries in parallel; load, backup, restore [Figure: SQL clients/BI tools connect via JDBC/ODBC to the leader node; leader node and three compute nodes, each with 128 GB RAM, 16 cores, and 16 TB disk, linked by 10 GigE (HPC); ingestion, backup, and restore through S3 / EMR / DynamoDB / SSH]
32 Brute force only takes you so far
33 Designed for I/O Reduction Columnar storage Data compression Zone maps CREATE TABLE audience ( aid INT /* audience_id */, loc CHAR(3) /* location */, dt DATE /* date */ ); [Figure: sample rows of audience with columns aid, loc, dt; loc values SFO and JFK] Accessing dt with row storage: Need to read everything Unnecessary I/O
34 Designed for I/O Reduction Columnar storage Data compression Zone maps CREATE TABLE audience ( aid INT /* audience_id */, loc CHAR(3) /* location */, dt DATE /* date */ ); [Figure: sample rows of audience with columns aid, loc, dt; loc values SFO and JFK] Accessing dt with columnar storage: Only scan blocks for relevant column
35 Designed for I/O Reduction Columnar storage Data compression Zone maps CREATE TABLE audience ( aid INT ENCODE LZO, loc CHAR(3) ENCODE BYTEDICT, dt DATE ENCODE RUNLENGTH ); [Figure: sample rows of audience with columns aid, loc, dt; loc values SFO and JFK] Columns grow and shrink independently Effective compression ratios due to like data Reduces storage requirements Reduces I/O
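The RUNLENGTH encoding named on slide 35 can be illustrated with a small sketch. This shows the general idea of run-length encoding on a sorted date column only; it is not Redshift's on-disk format, and the function names and sample values are hypothetical.

```python
from itertools import groupby


def run_length_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def run_length_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]


# Hypothetical dt column: sorted dates repeat in long runs, which is
# where run-length encoding pays off.
dt_column = ["2013-06-01"] * 4 + ["2013-06-02"] * 3 + ["2013-06-03"] * 2

encoded = run_length_encode(dt_column)
decoded = run_length_decode(encoded)

print(len(dt_column), "values stored as", len(encoded), "runs")
```

Because a column holds values of one type, and often in sorted order, like values sit next to each other, which is the "effective compression ratios due to like data" point on the slide.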
36 Designed for I/O Reduction Columnar storage Data compression Zone maps CREATE TABLE audience ( aid INT /* audience_id */, loc CHAR(3) /* location */, dt DATE /* date */ ); [Figure: sample rows of audience with columns aid, loc, dt; loc values SFO and JFK] In-memory block metadata Contains per-block MIN and MAX value Effectively prunes blocks which cannot contain data for a given query Eliminates unnecessary I/O
37 Zone Maps SELECT COUNT(*) FROM LOGS WHERE DATE = '09-JUNE-2013' Unsorted Table Sorted By Date MIN: 01-JUNE-2013 MAX: 20-JUNE-2013 MIN: 08-JUNE-2013 MAX: 30-JUNE-2013 MIN: 01-JUNE-2013 MAX: 06-JUNE-2013 MIN: 07-JUNE-2013 MAX: 12-JUNE-2013 MIN: 12-JUNE-2013 MAX: 20-JUNE-2013 MIN: 13-JUNE-2013 MAX: 18-JUNE-2013 MIN: 02-JUNE-2013 MAX: 25-JUNE-2013 MIN: 19-JUNE-2013 MAX: 24-JUNE-2013
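The effect of sorting on zone-map pruning can be sketched with the MIN/MAX values from slide 37. The grouping of those values into an "unsorted" and a "sorted" set of four blocks each is an assumption made from the interleaved transcript, and `blocks_to_scan` is a hypothetical helper, not a Redshift API.

```python
from datetime import date


def june(day):
    """Shorthand for a date in June 2013, the month used on the slide."""
    return date(2013, 6, day)


# Per-block (MIN, MAX) pairs; the split into two tables is assumed.
unsorted_zone_map = [
    (june(1), june(20)),
    (june(8), june(30)),
    (june(12), june(20)),
    (june(2), june(25)),
]
sorted_zone_map = [
    (june(1), june(6)),
    (june(7), june(12)),
    (june(13), june(18)),
    (june(19), june(24)),
]


def blocks_to_scan(zone_map, target):
    """Return indexes of blocks whose [MIN, MAX] range could hold target."""
    return [i for i, (lo, hi) in enumerate(zone_map) if lo <= target <= hi]


target = june(9)  # WHERE DATE = '09-JUNE-2013'
unsorted_hits = blocks_to_scan(unsorted_zone_map, target)
sorted_hits = blocks_to_scan(sorted_zone_map, target)

print("unsorted table scans blocks", unsorted_hits)
print("sorted table scans blocks", sorted_hits)
```

Under these assumptions the unsorted table has to read three of its four blocks, while the table sorted by date reads one.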
38 Data Distribution Distribution style is a table property which dictates how that table's data is distributed throughout the cluster: KEY: Value is hashed, same value goes to same location (slice) ALL: Full table data goes to first slice of every node EVEN: Round robin Goals: Distribute data evenly for parallel processing Minimize data movement during query processing [Figure: KEY, ALL, and EVEN placement across slices 1-4 on nodes 1 and 2]
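The three distribution styles on slide 38 can be mimicked with a toy placement model: two nodes with two slices each, as in the slide's diagram. The CRC32 hash is a stand-in chosen for determinism, not the hash Redshift uses, and the row data and function names are hypothetical.

```python
import zlib

NUM_NODES = 2
SLICES_PER_NODE = 2
NUM_SLICES = NUM_NODES * SLICES_PER_NODE


def empty_placement():
    return {slice_id: [] for slice_id in range(NUM_SLICES)}


def distribute_key(rows, key_index):
    """KEY: hash the key column; equal values land on the same slice."""
    placement = empty_placement()
    for row in rows:
        digest = zlib.crc32(str(row[key_index]).encode("utf-8"))
        placement[digest % NUM_SLICES].append(row)
    return placement


def distribute_even(rows):
    """EVEN: round robin across all slices."""
    placement = empty_placement()
    for i, row in enumerate(rows):
        placement[i % NUM_SLICES].append(row)
    return placement


def distribute_all(rows):
    """ALL: a full copy on the first slice of every node."""
    placement = empty_placement()
    for first_slice in range(0, NUM_SLICES, SLICES_PER_NODE):
        placement[first_slice] = list(rows)
    return placement


# Hypothetical (aid, loc) rows.
rows = [(i, loc) for i, loc in enumerate(["SFO", "JFK"] * 4, start=1)]

by_key = distribute_key(rows, key_index=1)
by_even = distribute_even(rows)
by_all = distribute_all(rows)
```

In this model KEY keeps all rows for one `loc` value together (useful for co-located joins but prone to skew when few distinct values exist), EVEN balances row counts, and ALL trades storage for not having to move the table at query time.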
39 What is next? When your data sets become so large and diverse that you have to start innovating around how to collect, store, process, analyze and share them
40 The Dark Data Problem Most generated data is unavailable for analysis Data Volume Generated Data Available for Analysis Year Sources: Gartner: User Survey Analysis: Key Trends Shaping the Future of Data Center Infrastructure Through 2011 IDC: Worldwide Business Analytics Software Forecast and 2011 Vendor Shares
41 The tyranny of OR Amazon EMR Amazon Redshift Directly access data in S3 Super-fast local disk performance Scale out to thousands of nodes Sophisticated query optimization Open data formats Join-optimized data formats Popular big data frameworks Query using standard SQL Anything you can dream up and code Optimized for data warehousing
42 Customers want sophisticated query optimization and scale-out processing super fast performance and support for open formats the throughput of local disk and the scale of S3
43 Amazon Redshift Spectrum
44 Amazon Redshift Spectrum Run SQL queries directly against data in S3 using thousands of nodes exabyte scale Elastic & highly available On-demand, pay-per-query High concurrency: Multiple clusters access same data S3 No ETL: Query data in-place using open file formats SQL Full Amazon Redshift SQL support
45 Life of a query 1 Query SELECT COUNT(*) FROM S3.EXT_TABLE GROUP BY JDBC/ODBC Amazon Redshift N... Amazon S3 Exabyte-scale object storage Data Catalog Apache Hive Metastore
46 Life of a query JDBC/ODBC Amazon Redshift 2 Query is optimized and compiled at the leader node. Determine what gets run locally and what goes to Amazon Redshift Spectrum N... Amazon S3 Exabyte-scale object storage Data Catalog Apache Hive Metastore
47 Life of a query JDBC/ODBC Amazon Redshift 3 Query plan is sent to all compute nodes N... Amazon S3 Exabyte-scale object storage Data Catalog Apache Hive Metastore
48 Life of a query JDBC/ODBC Amazon Redshift 4 Compute nodes obtain partition info from Data Catalog; dynamically prune partitions N... Amazon S3 Exabyte-scale object storage Data Catalog Apache Hive Metastore
49 Life of a query JDBC/ODBC Amazon Redshift 5 Each compute node issues multiple requests to the Amazon Redshift Spectrum layer N... Amazon S3 Exabyte-scale object storage Data Catalog Apache Hive Metastore
50 Life of a query JDBC/ODBC Amazon Redshift N... 6 Amazon Redshift Spectrum nodes scan your S3 data Amazon S3 Exabyte-scale object storage Data Catalog Apache Hive Metastore
51 Life of a query JDBC/ODBC Amazon Redshift 7 Amazon Redshift Spectrum projects, filters, joins and aggregates N... Amazon S3 Exabyte-scale object storage Data Catalog Apache Hive Metastore
52 Life of a query JDBC/ODBC Amazon Redshift 8 Final aggregations and joins with local Amazon Redshift tables done in-cluster N... Amazon S3 Exabyte-scale object storage Data Catalog Apache Hive Metastore
53 Life of a query JDBC/ODBC 9 Result is sent back to client Amazon Redshift N... Amazon S3 Exabyte-scale object storage Data Catalog Apache Hive Metastore
54 Running an analytic query over an exabyte in S3
55 Now let's run a query over an exabyte of data in S3 Roughly 140 TB of customer item order detail records for each day over past 20 years. 190 million files across 15,000 partitions in S3. One partition per day for USA and rest of world. Need a billion-fold reduction in data processed. Running this query using a 1000 node Hive cluster would take over 5 years.* Compression...5x Columnar file format...10x Scanning with 2500 nodes...2500x Static partition elimination...2x Dynamic partition elimination...350x Redshift's query optimizer...40x Total reduction...3.5Bx * Estimated using 20 node Hive cluster & 1.4TB, assume linear * Query used a 20 node DC1.8XLarge Amazon Redshift cluster * Not actual sales data - generated for this demo based on data format used by Amazon Retail.
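The figures on slide 55 can be checked with a few lines of arithmetic. The sketch below assumes the scanning step contributes a factor of 2500 (one per node), which is the value that makes the product match the stated 3.5 billion total; it also assumes 365-day years and decimal units (1 EB = 1,000,000 TB).

```python
from math import prod

# Reduction factors as listed on the slide; the 2500x scanning factor
# is inferred from the node count and the stated total.
factors = {
    "compression": 5,
    "columnar file format": 10,
    "scanning with 2500 nodes": 2500,
    "static partition elimination": 2,
    "dynamic partition elimination": 350,
    "query optimizer": 40,
}
total_reduction = prod(factors.values())

# Data volume: roughly 140 TB per day over 20 years.
days = 20 * 365
total_tb = 140 * days
total_eb = total_tb / 1_000_000

# One partition per day for USA and one for rest of world.
partitions = days * 2

print(f"total reduction: {total_reduction:,}x")
print(f"data volume: {total_tb:,} TB (about {total_eb:.2f} EB)")
print(f"partitions: {partitions:,}")
```

The product comes to 3.5 billion, the volume to about one exabyte, and the partition count to 14,600, which the slide rounds to 15,000.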
56 Amazon Redshift Spectrum is fast Leverages Amazon Redshift's advanced cost-based optimizer Pushes down projections, filters, aggregations and join reduction Dynamic partition pruning to minimize data processed Automatic parallelization of query execution against S3 data Efficient join processing within the Amazon Redshift cluster
57 Amazon Redshift Spectrum is cost-effective You pay for your Amazon Redshift cluster plus $5 per TB scanned from S3 Each query can leverage 1000s of Amazon Redshift Spectrum nodes You can reduce the TB scanned and improve query performance by: Partitioning data Using a columnar file format Compressing data
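A rough cost model shows why the three techniques on slide 57 matter. The $5 per TB price comes from the slide, and the 1 TB to 130 GB Parquet ratio is the figure quoted on slide 62; the 3-of-10 column projection and the 1-in-30 partition hit are hypothetical, and the model ignores any rounding or minimum charges in actual billing.

```python
PRICE_PER_TB_SCANNED = 5.00  # USD, from the slide


def scan_cost(tb_scanned):
    """Cost in USD of a query that scans tb_scanned terabytes from S3."""
    return tb_scanned * PRICE_PER_TB_SCANNED


raw_text_tb = 1.0    # uncompressed text, full scan
parquet_tb = 0.130   # same data as Parquet with Snappy (slide 62)

# Hypothetical query shape: reads 3 of 10 equally sized columns and
# touches 1 of 30 daily partitions.
column_fraction = 3 / 10
partition_fraction = 1 / 30

cost_text = scan_cost(raw_text_tb)
cost_parquet = scan_cost(parquet_tb)
cost_pruned = scan_cost(parquet_tb * column_fraction * partition_fraction)

print(f"text, full scan:         ${cost_text:.2f}")
print(f"parquet, full scan:      ${cost_parquet:.2f}")
print(f"parquet, pruned columns and partitions: ${cost_pruned:.4f}")
```

Each technique multiplies down the terabytes scanned, so the savings compound in the same way as the reduction factors on slide 55.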
58 Amazon Redshift Spectrum is secure End-to-end data encryption Encrypt S3 data using SSE and AWS KMS Encrypt all Amazon Redshift data using KMS, AWS CloudHSM or your on-premises HSMs Audit logging All API calls are logged using AWS CloudTrail All SQL statements are logged within Amazon Redshift Enforce SSL with perfect forward encryption using ECDHE Virtual private cloud Amazon Redshift leader node in your VPC. Compute nodes in private VPC. Spectrum nodes in private VPC, store no state. Alerts & notifications Communicate event-specific notifications via email, text message, or call with Amazon SNS Certifications & compliance SOC1/2/3 FedRAMP HIPAA/BAA PCI/DSS
59 Amazon Redshift Spectrum uses standard SQL Redshift Spectrum seamlessly integrates with your existing SQL & BI apps Support for complex joins, nested queries & window functions Support for data partitioned in S3 by any key Date, Time and any other custom keys e.g., Year, Month, Day, Hour
60 Defining External Schema and Creating Tables Define an external schema in Amazon Redshift using the Amazon Athena data catalog or your own Apache Hive Metastore CREATE EXTERNAL SCHEMA <schema_name> Query external tables using <schema_name>.<table_name> Register external tables using Athena, your Hive Metastore client, or from Amazon Redshift CREATE EXTERNAL TABLE syntax CREATE EXTERNAL TABLE <table_name> [PARTITIONED BY <column_name, data_type, ...>] STORED AS file_format LOCATION s3_location [TABLE PROPERTIES property_name=property_value, ...];
61 Amazon Redshift Spectrum Current support File formats Parquet CSV Sequence RCFile ORC (coming soon) RegExSerDe (coming soon) Compression Gzip Snappy Lzo (coming soon) Bz2 Encryption SSE with AES256 SSE KMS with default key Column types Numeric: bigint, int, smallint, float, double and decimal Char/varchar/string Timestamp Boolean DATE type can be used only as a partitioning key Table type Non-partitioned table (s3://mybucket/orders/..) Partitioned table (s3://mybucket/orders/date=YYYY-MM-DD/..)
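The partitioned layout on slide 61 (`s3://mybucket/orders/date=YYYY-MM-DD/..`) follows the Hive-style `key=value` prefix convention. A minimal sketch of generating those prefixes, and of narrowing a date-range query to the prefixes it needs, might look like this; the helper names are hypothetical and no S3 calls are made.

```python
from datetime import date, timedelta


def partition_prefix(base, day):
    """Build the Hive-style prefix for one daily partition."""
    return f"{base}/date={day.isoformat()}/"


def prefixes_for_range(base, start, end):
    """List the partition prefixes covering start..end inclusive."""
    num_days = (end - start).days + 1
    return [partition_prefix(base, start + timedelta(days=i)) for i in range(num_days)]


base = "s3://mybucket/orders"  # bucket and path taken from the slide
one_day = partition_prefix(base, date(2013, 6, 9))
three_days = prefixes_for_range(base, date(2013, 6, 8), date(2013, 6, 10))

print(one_day)
for prefix in three_days:
    print(prefix)
```

A query filtered on the partition key only needs to read objects under the matching prefixes, which is the basis for the partition elimination described on the earlier slides.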
62 Converting to Parquet and ORC using Amazon EMR You can use Hive CREATE TABLE AS SELECT to convert data CREATE TABLE data_converted STORED AS PARQUET AS SELECT col_1, col2, col3 FROM data_source Or use Spark - 20 lines of PySpark code, running on Amazon EMR 1 TB of text data reduced to 130 GB in Parquet format with Snappy compression Total cost of EMR job to do this: $5
63 Is Amazon Redshift Spectrum useful if I don't have an exabyte? Your data will get bigger On average, data warehousing volumes grow 10x every 5 years The average Amazon Redshift customer doubles data each year Amazon Redshift Spectrum makes data analysis simpler Access your data without ETL pipelines Teams using Amazon EMR, Athena & Redshift can collaborate using the same data lake Amazon Redshift Spectrum improves availability and concurrency Run multiple Amazon Redshift clusters against common data Isolate jobs with tight SLAs from ad hoc analysis
64 Over 20 customers helped preview Amazon Redshift Spectrum
65 The Emerging Analytics Architecture Storage Amazon S3 Exabyte-scale Object Storage AWS Glue Data Catalog Hive-compatible Metastore Serverless Compute Amazon Kinesis Firehose Real-Time Data Streaming AWS Glue ETL & Data Catalog Amazon Redshift Spectrum Exabyte scale AWS Lambda Trigger-based Code Execution Data Processing Amazon EMR Managed Hadoop Applications Amazon Redshift Petabyte-scale Data Warehousing Amazon Athena Athena Interactive Query
66 Resources Amazon Redshift Engineering's Advanced Table Design Playbook Admin scripts Collection of utilities for running diagnostics on your cluster Admin views Collection of utilities for managing your cluster, generating schema DDL, etc. ColumnEncodingUtility Gives you the ability to apply optimal column encoding to an established schema with data already loaded
67 Thank You!
JethroData Making Interactive BI for Big Data a Reality Technical White Paper This white paper explains how JethroData can help you achieve a truly interactive interactive response time for BI on big data,
More informationAWS Serverless Architecture Think Big
MAKING BIG DATA COME ALIVE AWS Serverless Architecture Think Big Garrett Holbrook, Data Engineer Feb 1 st, 2017 Agenda What is Think Big? Example Project Walkthrough AWS Serverless 2 Think Big, a Teradata
More informationAWS Storage Optimization. AWS Whitepaper
AWS Storage Optimization AWS Whitepaper AWS Storage Optimization: AWS Whitepaper Copyright 2018 Amazon Web Services, Inc. and/or its affiliates. All rights reserved. Amazon's trademarks and trade dress
More informationStages of Data Processing
Data processing can be understood as the conversion of raw data into a meaningful and desired form. Basically, producing information that can be understood by the end user. So then, the question arises,
More informationShark: Hive (SQL) on Spark
Shark: Hive (SQL) on Spark Reynold Xin UC Berkeley AMP Camp Aug 21, 2012 UC BERKELEY SELECT page_name, SUM(page_views) views FROM wikistats GROUP BY page_name ORDER BY views DESC LIMIT 10; Stage 0: Map-Shuffle-Reduce
More informationAmazon Web Services. Block 402, 4 th Floor, Saptagiri Towers, Above Pantaloons, Begumpet Main Road, Hyderabad Telangana India
(AWS) Overview: AWS is a cloud service from Amazon, which provides services in the form of building blocks, these building blocks can be used to create and deploy various types of application in the cloud.
More informationActivator Library. Focus on maximizing the value of your data, gain business insights, increase your team s productivity, and achieve success.
Focus on maximizing the value of your data, gain business insights, increase your team s productivity, and achieve success. ACTIVATORS Designed to give your team assistance when you need it most without
More informationLevel Up Your CF Apps with Amazon Web Services
Level Up Your CF Apps with Amazon Web Services Brian Klaas bklaas@jhu.edu @brian_klaas Level Up Your CF Apps with Amazon Web Services Brian Klaas bklaas@jhu.edu @brian_klaas Hello Hello Hello Hello Hello
More informationQLIK INTEGRATION WITH AMAZON REDSHIFT
QLIK INTEGRATION WITH AMAZON REDSHIFT Qlik Partner Engineering Created August 2016, last updated March 2017 Contents Introduction... 2 About Amazon Web Services (AWS)... 2 About Amazon Redshift... 2 Qlik
More informationCrypto-Options on AWS. Bertram Dorn Specialized Solutions Architect Security/Compliance Network/Databases Amazon Web Services Germany GmbH
Crypto-Options on AWS Bertram Dorn Specialized Solutions Architect Security/Compliance Network/Databases Amazon Web Services Germany GmbH Amazon.com, Inc. and its affiliates. All rights reserved. Agenda
More informationEnroll Now to Take online Course Contact: Demo video By Chandra sir
Enroll Now to Take online Course www.vlrtraining.in/register-for-aws Contact:9059868766 9985269518 Demo video By Chandra sir www.youtube.com/watch?v=8pu1who2j_k Chandra sir Class 01 https://www.youtube.com/watch?v=fccgwstm-cc
More informationCloudExpo November 2017 Tomer Levi
CloudExpo November 2017 Tomer Levi About me Full Stack Engineer @ Intel s Advanced Analytics group. Artificial Intelligence unit at Intel. Responsible for (1) Radical improvement of critical processes
More informationExperiences with Serverless Big Data
Experiences with Serverless Big Data AWS Meetup Munich 2016 Markus Schmidberger, Head of Data Service Munich, 17.10.16 Key Components of our Data Service Real-Time Monitoring Enable our development teams
More informationStreaming Data: The Opportunity & How to Work With It
Streaming Data: The Opportunity & How to Work With It Roger Barga, GM Amazon Kinesis April 2016 2016, Amazon Web Services, Inc. or its Affiliates. All rights reserved. Interest in and demand for stream
More informationOracle WebLogic Server 12c on AWS. December 2018
Oracle WebLogic Server 12c on AWS December 2018 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Notices This document is provided for informational purposes only. It represents
More informationWe are ready to serve Latest IT Trends, Are you ready to learn? New Batches Info
We are ready to serve Latest IT Trends, Are you ready to learn? New Batches Info START DATE : TIMINGS : DURATION : TYPE OF BATCH : FEE : FACULTY NAME : LAB TIMINGS : Storage & Database Services : Introduction
More informationData 101 Which DB, When Joe Yong Sr. Program Manager Microsoft Corp.
17-18 March, 2018 Beijing Data 101 Which DB, When Joe Yong Sr. Program Manager Microsoft Corp. The world is changing AI increased by 300% in 2017 Data will grow to 44 ZB in 2020 Today, 80% of organizations
More informationWerden Sie ein Teil von Internet der Dinge auf AWS. AWS Enterprise Summit 2015 Dr. Markus Schmidberger -
Werden Sie ein Teil von Internet der Dinge auf AWS AWS Enterprise Summit 2015 Dr. Markus Schmidberger - schmidbe@amazon.de Internet of Things is the network of physical objects or "things" embedded with
More informationAWS IoT Overview. July 2016 Thomas Jones, Partner Solutions Architect
AWS IoT Overview July 2016 Thomas Jones, Partner Solutions Architect AWS customers are connecting physical things to the cloud in every industry imaginable. Healthcare and Life Sciences Municipal Infrastructure
More informationBetter, Faster, Stronger web apps with Amazon Web Services. Senior Technology Evangelist, Amazon Web Services
Better, Faster, Stronger web apps with Amazon Web Services Simone Brunozzi ( @simon ) Senior Technology Evangelist, Amazon Web Services (from the previous presentation) Knowledge starts from great questions.
More informationAWS Solutions Architect Associate (SAA-C01) Sample Exam Questions
1) A company is storing an access key (access key ID and secret access key) in a text file on a custom AMI. The company uses the access key to access DynamoDB tables from instances created from the AMI.
More informationModern Data Warehouse The New Approach to Azure BI
Modern Data Warehouse The New Approach to Azure BI History On-Premise SQL Server Big Data Solutions Technical Barriers Modern Analytics Platform On-Premise SQL Server Big Data Solutions Modern Analytics
More informationMODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESINGS
MODERN BIG DATA DESIGN PATTERNS CASE DRIVEN DESINGS SUJEE MANIYAM FOUNDER / PRINCIPAL @ ELEPHANT SCALE www.elephantscale.com sujee@elephantscale.com HI, I M SUJEE MANIYAM Founder / Principal @ ElephantScale
More informationAWS Services for Data Migration Luke Anderson Head of Storage, AWS APAC
AWS Services for Data Migration Luke Anderson Head of Storage, AWS APAC Offline Online Complete Set Of Data Building Blocks Data movement AWS Storage Gateway Family Data security and management Amazon
More informationExtend NonStop Applications with Cloud-based Services. Phil Ly, TIC Software John Russell, Canam Software
Extend NonStop Applications with Cloud-based Services Phil Ly, TIC Software John Russell, Canam Software Agenda Cloud Computing and Microservices Amazon Web Services (AWS) Integrate NonStop with AWS Managed
More informationAutonomous Database Level 100
Autonomous Database Level 100 Sanjay Narvekar December 2018 1 Safe Harbor Statement The following is intended to outline our general product direction. It is intended for information purposes only, and
More informationSecurity & Compliance in the AWS Cloud. Vijay Rangarajan Senior Cloud Architect, ASEAN Amazon Web
Security & Compliance in the AWS Cloud Vijay Rangarajan Senior Cloud Architect, ASEAN Amazon Web Services @awscloud www.cloudsec.com #CLOUDSEC Security & Compliance in the AWS Cloud TECHNICAL & BUSINESS
More informationAWS Storage Gateway. Not your father s hybrid storage. University of Arizona IT Summit October 23, Jay Vagalatos, AWS Solutions Architect
AWS Storage Gateway Not your father s hybrid storage University of Arizona IT Summit 2017 Jay Vagalatos, AWS Solutions Architect October 23, 2017 The AWS Storage Portfolio Amazon EBS (persistent) Block
More informationSAA-C01. AWS Solutions Architect Associate. Exam Summary Syllabus Questions
SAA-C01 AWS Solutions Architect Associate Exam Summary Syllabus Questions Table of Contents Introduction to SAA-C01 Exam on AWS Solutions Architect Associate... 2 AWS SAA-C01 Certification Details:...
More informationHadoop 2.x Core: YARN, Tez, and Spark. Hortonworks Inc All Rights Reserved
Hadoop 2.x Core: YARN, Tez, and Spark YARN Hadoop Machine Types top-of-rack switches core switch client machines have client-side software used to access a cluster to process data master nodes run Hadoop
More informationAWS Administration. Suggested Pre-requisites Basic IT Knowledge
Course Description Amazon Web Services Administration (AWS Administration) course starts your Cloud Journey. If you are planning to learn Cloud Computing and Amazon Web Services in particular, then this
More informationData Architectures in Azure for Analytics & Big Data
Data Architectures in for Analytics & Big Data October 20, 2018 Melissa Coates Solution Architect, BlueGranite Microsoft Data Platform MVP Blog: www.sqlchick.com Twitter: @sqlchick Data Architecture A
More informationDATABASE SCALE WITHOUT LIMITS ON AWS
The move to cloud computing is changing the face of the computer industry, and at the heart of this change is elastic computing. Modern applications now have diverse and demanding requirements that leverage
More informationAWS Storage Gateway. Amazon S3. Amazon EFS. Amazon Glacier. Amazon EBS. Amazon EC2 Instance. storage. File Block Object. Hybrid integrated.
AWS Storage Amazon EFS Amazon EBS Amazon EC2 Instance storage Amazon S3 Amazon Glacier AWS Storage Gateway File Block Object Hybrid integrated storage Amazon S3 Amazon Glacier Amazon EBS Amazon EFS Durable
More informationAmazon Athena: User Guide
Amazon Athena User Guide Amazon Athena: User Guide Copyright 2018 Amazon Web Services, Inc. and/or its affiliates. All rights reserved. Amazon's trademarks and trade dress may not be used in connection
More informationMONITORING SERVERLESS ARCHITECTURES
MONITORING SERVERLESS ARCHITECTURES CAN YOU HELP WITH SOME PRODUCTION PROBLEMS? Your Manager (CC) Rachel Gardner Rafal Gancarz Lead Consultant @ OpenCredo WHAT IS SERVERLESS? (CC) theaucitron Cloud-native
More informationSTATE OF MODERN APPLICATIONS IN THE CLOUD
STATE OF MODERN APPLICATIONS IN THE CLOUD 2017 Introduction The Rise of Modern Applications What is the Modern Application? Today s leading enterprises are striving to deliver high performance, highly
More informationTraining on Amazon AWS Cloud Computing. Course Content
Training on Amazon AWS Cloud Computing Course Content 15 Amazon Web Services (AWS) Cloud Computing 1) Introduction to cloud computing Introduction to Cloud Computing Why Cloud Computing? Benefits of Cloud
More informationPass4test Certification IT garanti, The Easy Way!
Pass4test Certification IT garanti, The Easy Way! http://www.pass4test.fr Service de mise à jour gratuit pendant un an Exam : SOA-C01 Title : AWS Certified SysOps Administrator - Associate Vendor : Amazon
More informationAmazon Web Services (AWS) Training Course Content
Amazon Web Services (AWS) Training Course Content SECTION 1: CLOUD COMPUTING INTRODUCTION History of Cloud Computing Concept of Client Server Computing Distributed Computing and it s Challenges What is
More informationAmazon Web Services Training. Training Topics:
Amazon Web Services Training Training Topics: SECTION1: INTRODUCTION TO CLOUD COMPUTING A Short history Client Server Computing Concepts Challenges with Distributed Computing Introduction to Cloud Computing
More informationAmazon Web Services. For Government, Education, and Nonprofit Organizations
Amazon Web Services For Government, Education, and Nonprofit Organizations Max Peterson GM EMEA, LATAM and Global Contracts maxpete@amazon.co.uk +44 (0)7342 079563 2015, Amazon Web Services, Inc. or its
More informationResearch at PNNL: Powered by AWS NLIT 2018
Research at PNNL: Powered by AWS NLIT 2018 RALPH PERKO AND MIKE GIARDINELLI Pacific Northwest National Laboratory Reference herein to any specific commercial product, process, or service by trade name,
More informationBig Data. Big Data Analyst. Big Data Engineer. Big Data Architect
Big Data Big Data Analyst INTRODUCTION TO BIG DATA ANALYTICS ANALYTICS PROCESSING TECHNIQUES DATA TRANSFORMATION & BATCH PROCESSING REAL TIME (STREAM) DATA PROCESSING Big Data Engineer BIG DATA FOUNDATION
More informationMapR Enterprise Hadoop
2014 MapR Technologies 2014 MapR Technologies 1 MapR Enterprise Hadoop Top Ranked Cloud Leaders 500+ Customers 2014 MapR Technologies 2 Key MapR Advantage Partners Business Services APPLICATIONS & OS ANALYTICS
More informationSecurity & Compliance in the AWS Cloud. Amazon Web Services
Security & Compliance in the AWS Cloud Amazon Web Services Our Culture Simple Security Controls Job Zero AWS Pace of Innovation AWS has been continually expanding its services to support virtually any
More informationAWS Solutions Architect Exam Tips
AWS Solutions Architect Exam Tips This is not a brain dump! Questions and Answers are not given here, rather guidelines for further research, reviewing the Architecting on AWS courseware and AWS documentation.
More informationMicroservices on AWS. Matthias Jung, Solutions Architect AWS
Microservices on AWS Matthias Jung, Solutions Architect AWS Agenda What are Microservices? Why Microservices? Challenges of Microservices Microservices on AWS What are Microservices? What are Microservices?
More information