AWS Storage Gateway: Not your father's hybrid storage
University of Arizona IT Summit 2017
Jay Vagalatos, AWS Solutions Architect
October 23, 2017
The AWS Storage Portfolio
- Block: Amazon EBS (persistent), Amazon EC2 Instance Store (ephemeral)
- File: Amazon EFS
- Object: Amazon S3, Amazon Glacier
- Cloud data migration: Snow* data transport family, Storage Gateway, Direct Connect, 3rd-party connectors, Transfer Acceleration, Kinesis Firehose
Hybrid storage use cases and architectures
- Enabling cloud workloads: move data to AWS storage for Big Data, cloud bursting, or migration
- Backup, archive, and disaster recovery: cost-effective storage in AWS with local or cloud restore
- Tiered cloud storage: easily add AWS storage to your on-premises environment
Storage Gateway: hybrid storage solutions
- Enables using standard storage protocols to access AWS storage services
[Diagram: files, volumes, and tapes -> AWS Storage Gateway -> Amazon S3, Amazon Glacier, Amazon EBS snapshots; integrated with Amazon CloudWatch, AWS Identity and Access Management (IAM), AWS CloudTrail, and AWS Key Management Service (KMS)]
Storage Gateway: files, volumes, and tapes
- File gateway: NFS (v3 and v4.1) interface; on-premises file storage backed by Amazon S3 objects
- Volume gateway: iSCSI block interface; on-premises block storage backed by S3 with EBS snapshots
- Tape gateway: iSCSI virtual tape library interface; virtual tape storage in Amazon S3 and Glacier with VTL management
Storage Gateway: common capabilities
- Standard storage protocols integrate with on-premises applications
- Local caching for low-latency access to frequently used data
- Efficient data transfer with buffering and bandwidth management
- Native data storage in AWS
- Stateless virtual appliance for resiliency
- Integrated with AWS management and security
File gateway: on-premises file storage maintained as objects in Amazon S3
- Data stored in and retrieved from your S3 buckets
- One-to-one mapping from files to objects
- File metadata stored in object metadata
- Bucket access through an IAM role you own and manage
- Use S3 lifecycle policies, versioning, or cross-region replication (CRR) to manage data
[Diagram: Customer premises: Application Server -> NFS v3 / v4.1 -> File Gateway -> HTTPS -> Amazon S3 (S3 Standard, S3 Standard - Infrequent Access, Glacier)]
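The one-to-one file-to-object mapping can be illustrated with a small sketch. It assumes the object key is simply the file's path relative to the root of the share; the mount point `/mnt/share` and the helper name `object_key_for` are hypothetical and chosen only for illustration.

```python
import posixpath


def object_key_for(share_root: str, file_path: str) -> str:
    """Return the S3 object key for a file under an NFS share mount.

    Assumption (illustrative only): the key is the file's path relative
    to the share root, with '/' as the separator.
    """
    root = posixpath.normpath(share_root)
    rel = posixpath.relpath(posixpath.normpath(file_path), root)
    # Reject the share root itself and anything that escapes the share.
    if rel == "." or rel == ".." or rel.startswith("../"):
        raise ValueError(f"{file_path!r} is not a file inside {share_root!r}")
    return rel


print(object_key_for("/mnt/share", "/mnt/share/reports/2017/q3.csv"))
```

Each file corresponds to exactly one object, so a directory tree on the share shows up as key prefixes in the bucket.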
File Gateway cache refresh (RefreshCache)
[Diagram: three scenarios in which objects in the S3 bucket change outside the gateway, each followed by a RefreshCache call on the Storage Gateway: (1) in-cloud workload, where an Amazon EMR job GETs/PUTs objects in the S3 bucket; (2) S3 cross-region replication between S3 buckets, with a read-only NFS client; (3) AWS Snowball loading the S3 bucket, with NFS and read-only NFS clients]
Volume Gateway: on-premises volume storage backed by Amazon S3 with EBS snapshots
- Block storage in S3 accessed via the volume gateway
- Data compressed in transit and at rest
- Back up on-premises volumes to EBS snapshots
- Create on-premises volumes from EBS snapshots
- Up to 1 PB of total volume storage per gateway
[Diagram: Customer premises: Application Server -> iSCSI -> Volume Gateway -> HTTPS -> Storage Gateway bucket in Amazon S3, Amazon EBS snapshots]
Volume Gateway: stored
- Primary data stored on-premises
- Asynchronous upload to AWS
- Point-in-time backups stored as Amazon EBS snapshots
- Up to 32 volumes, up to 16 TB each, for up to 512 TB per gateway
[Diagram: Customer data center: Application server (initiator) -> AWS Storage Gateway VM (target; volume storage, upload buffer) -> AWS Storage Gateway service -> Amazon EBS snapshots]
Volume Gateway: cached
- Primary data stored in AWS
- Frequently accessed data cached on-premises
- Point-in-time backups stored as Amazon EBS snapshots
- Up to 32 volumes, up to 32 TB each, for up to 1 PB per gateway
[Diagram: Customer data center: Application server (initiator) -> AWS Storage Gateway VM (target; cache storage, upload buffer) -> AWS Storage Gateway service -> volume storage backed by Amazon S3, Amazon EBS snapshots]
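The per-gateway limits quoted for stored and cached volumes follow directly from volume count times maximum volume size. A quick check, assuming binary units (1 PB = 1024 TB); the names `LIMITS` and `max_capacity_tb` are illustrative only:

```python
TB_PER_PB = 1024  # assumption: binary units, so 1 PB = 1024 TB

# (maximum volumes per gateway, maximum size of each volume in TB), from the slides
LIMITS = {
    "stored": (32, 16),
    "cached": (32, 32),
}


def max_capacity_tb(mode: str) -> int:
    """Total volume capacity per gateway, in TB, for the given volume mode."""
    volumes, tb_each = LIMITS[mode]
    return volumes * tb_each


for mode in LIMITS:
    tb = max_capacity_tb(mode)
    print(f"{mode}: {tb} TB ({tb / TB_PER_PB:g} PB)")
```

Under that assumption, stored volumes come to 512 TB and cached volumes to 1024 TB, that is, 1 PB.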
Tape gateway: virtual tape storage in Amazon S3 and Glacier with VTL management
- Virtual tape storage in S3 and Glacier accessed via the tape gateway
- Data compressed in transit and at rest
- Up to 1 PB total tape storage per gateway, unlimited archive capacity
[Diagram: Customer premises: Backup Server -> iSCSI (media changer, tape drive) -> Tape Gateway -> HTTPS -> virtual tapes stored in Amazon S3; archived tapes stored in Amazon Glacier]
Supports leading backup applications:
Storage Gateway: key benefits
- Seamless integration across standard storage protocols
- Low-latency access
- Durability, cost, and elasticity of AWS storage services
- Efficient data transfer
- Data encryption
- Integrated with AWS monitoring, management, and security
Storage Gateway pricing
- All gateway types: $0.01 per GB of data written to AWS*
- File: files stored and billed by S3
- Volume: $0.023 per GB-month of volume data stored; snapshots stored and billed by EBS
- Tape: $0.023 per GB-month of tape data stored; $0.004 per GB-month of tape data archived; $0.01 per GB of data retrieved from archive
* Up to a maximum of $125/month. First 100 GB per gateway free.
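As a rough worked example of the write fee, the sketch below applies the 2017 figures from this slide. It assumes the free 100 GB is subtracted first and the $125 cap is then applied per gateway per month; how the two interact in actual billing is an assumption here, and the function name is hypothetical.

```python
WRITE_PRICE_PER_GB = 0.01   # USD per GB written to AWS (from the slide)
FREE_GB_PER_GATEWAY = 100   # first 100 GB per gateway free (from the slide)
WRITE_FEE_CAP = 125.0       # USD per month maximum (from the slide)


def monthly_write_fee(gb_written: float) -> float:
    """Estimated monthly write fee for one gateway, in USD.

    Assumption: the free allowance is deducted before the cap is applied.
    """
    if gb_written < 0:
        raise ValueError("gb_written must be non-negative")
    billable_gb = max(0.0, gb_written - FREE_GB_PER_GATEWAY)
    return min(billable_gb * WRITE_PRICE_PER_GB, WRITE_FEE_CAP)


for gb in (50, 1_100, 50_000):
    print(f"{gb:>6} GB written -> ${monthly_write_fee(gb):.2f}")
```

Storage charges (S3, EBS snapshots, tape storage and archive) are billed separately and are not included in this estimate.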
Getting started: Storage Gateway
- AWS Storage Gateway home page: aws.amazon.com/storagegateway
- AWS Storage Gateway documentation: aws.amazon.com/documentation/storage-gateway/
- Ready to get started? AWS Storage Gateway console: console.aws.amazon.com/storagegateway/home
Thanks! Simple, Secure, Cost-effective Hybrid Storage in AWS
Under the hood
File Gateway: file system metadata
- File system metadata persisted in object user-metadata, e.g.:
  x-amz-meta-file-permissions: 0666
  x-amz-meta-file-user-class: 4321
  x-amz-meta-file-group-class: 42
  x-amz-meta-file-created: 2016-10-05T20:08:45+00:00
  x-amz-meta-file-last-modified: 2016-10-05T20:08:45+00:00
- Configurable defaults for objects that don't have this metadata, e.g. objects that were already in the bucket
- Changing file metadata copies the object
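A minimal sketch of reading the example metadata above back into file attributes. It uses only the header names and values shown on the slide; treating the user-class and group-class values as numeric owner and group IDs is an assumption, and `parse_file_metadata` is a hypothetical helper, not part of any AWS SDK.

```python
import stat
from datetime import datetime

# Example user-metadata exactly as shown on the slide.
EXAMPLE_METADATA = {
    "x-amz-meta-file-permissions": "0666",
    "x-amz-meta-file-user-class": "4321",
    "x-amz-meta-file-group-class": "42",
    "x-amz-meta-file-created": "2016-10-05T20:08:45+00:00",
    "x-amz-meta-file-last-modified": "2016-10-05T20:08:45+00:00",
}


def parse_file_metadata(headers: dict) -> dict:
    """Convert the string metadata values into typed file attributes."""
    mode = int(headers["x-amz-meta-file-permissions"], 8)  # octal permission bits
    return {
        "mode": mode,
        # Render as an ls-style string, assuming a regular file.
        "mode_string": stat.filemode(stat.S_IFREG | mode),
        # Assumption: these two values are numeric owner and group IDs.
        "owner_id": int(headers["x-amz-meta-file-user-class"]),
        "group_id": int(headers["x-amz-meta-file-group-class"]),
        "created": datetime.fromisoformat(headers["x-amz-meta-file-created"]),
        "last_modified": datetime.fromisoformat(headers["x-amz-meta-file-last-modified"]),
    }


meta = parse_file_metadata(EXAMPLE_METADATA)
print(meta["mode_string"], meta["owner_id"], meta["group_id"], meta["created"].isoformat())
```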
How do I monitor cache performance?
- CloudWatch metrics for gateway: CachePercentUsed, CacheHitPercent, CachePercentDirty
- High CachePercentUsed is good
  - Once it reaches 100%, it will start to impact CacheHitPercent
  - May indicate the cache is too small for the working set
- Writes increase CachePercentDirty
  - Reduces the size of the working cache, and can reduce CacheHitPercent
  - May indicate data is not uploading to AWS quickly enough
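The guidance above can be turned into a simple check over the three metrics. This is only a sketch: the thresholds `min_hit_percent` and `max_dirty_percent` are hypothetical values picked for illustration, not AWS recommendations, and the inputs are assumed to be recent CloudWatch datapoints expressed as percentages.

```python
def diagnose_cache(percent_used: float, hit_percent: float, percent_dirty: float,
                   *, min_hit_percent: float = 80.0,
                   max_dirty_percent: float = 50.0) -> list:
    """Return human-readable findings for one gateway's cache metrics.

    Thresholds are hypothetical defaults; tune them for your workload.
    """
    findings = []
    # A full cache is fine on its own; it matters once the hit rate drops.
    if percent_used >= 100.0 and hit_percent < min_hit_percent:
        findings.append("cache may be too small for the working set")
    # Dirty data occupies cache space until it has been uploaded.
    if percent_dirty > max_dirty_percent:
        findings.append("data may not be uploading to AWS quickly enough")
    return findings


print(diagnose_cache(percent_used=100.0, hit_percent=55.0, percent_dirty=70.0))
print(diagnose_cache(percent_used=100.0, hit_percent=97.0, percent_dirty=5.0))
```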
How does Storage Gateway transfer data to AWS?
- Dirty data is uploaded to AWS asynchronously
  - Byte-level parallel upload and download
  - Compression for tape and volume
- Uploaded data is committed to the storage resource periodically
  - File uses multipart PUT
  - Tape and volume create periodic recovery points (internal snapshots)
- The application's write rate determines how often commits happen
  - High write rates will commit more frequently
  - Low write rates will commit based on a timer
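The commit behavior described above (more frequent commits under heavy writes, timer-based commits under light writes) resembles a common "size or time, whichever comes first" policy. The sketch below illustrates that general pattern only; it is not the service's actual algorithm, and the class name and both thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CommitPolicy:
    """Commit when enough data has accumulated, or when a timer expires."""
    max_uncommitted_bytes: int = 64 * 1024 * 1024  # hypothetical size trigger
    max_interval_seconds: float = 300.0            # hypothetical timer
    uncommitted_bytes: int = 0
    last_commit_time: float = 0.0

    def record_write(self, nbytes: int, now: float) -> bool:
        """Account for uploaded bytes; return True if a commit was triggered."""
        self.uncommitted_bytes += nbytes
        return self._maybe_commit(now)

    def tick(self, now: float) -> bool:
        """Periodic check so that slow writers still get committed."""
        return self._maybe_commit(now)

    def _maybe_commit(self, now: float) -> bool:
        size_trigger = self.uncommitted_bytes >= self.max_uncommitted_bytes
        timer_trigger = (self.uncommitted_bytes > 0
                         and now - self.last_commit_time >= self.max_interval_seconds)
        if size_trigger or timer_trigger:
            self.uncommitted_bytes = 0
            self.last_commit_time = now
            return True
        return False


policy = CommitPolicy(max_uncommitted_bytes=100, max_interval_seconds=60.0)
print(policy.record_write(40, now=1.0))   # below the size trigger, timer not expired
print(policy.record_write(70, now=2.0))   # size trigger reached
print(policy.record_write(10, now=3.0))   # small write, waits for the timer
print(policy.tick(now=62.0))              # timer expired with data pending
```

With a policy like this, a busy writer hits the size trigger often, while an idle or slow writer is committed only when the timer fires.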
How do I monitor data transfer?
- CloudWatch metrics for gateway and per storage resource: WriteBytes, ReadBytes, CloudBytesUploaded, CloudBytesDownloaded
- For volumes, TimeSinceLastRecoveryPoint indicates how long since the last commit
[Diagram: Application server, Storage Gateway]
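One way to pull these metrics programmatically is through the CloudWatch GetMetricStatistics API. The sketch below only builds the request parameters, so it runs without AWS credentials. It assumes the `AWS/StorageGateway` namespace with `GatewayId` and `GatewayName` dimensions; the gateway ID and name in the example are placeholders, and `transfer_metric_request` is a hypothetical helper.

```python
from datetime import datetime, timedelta, timezone


def transfer_metric_request(gateway_id: str, gateway_name: str,
                            metric_name: str = "CloudBytesUploaded",
                            hours: int = 1, now: datetime = None) -> dict:
    """Build keyword arguments for a CloudWatch GetMetricStatistics call."""
    end = now or datetime.now(timezone.utc)
    return {
        "Namespace": "AWS/StorageGateway",  # assumed namespace for gateway metrics
        "MetricName": metric_name,
        "Dimensions": [
            {"Name": "GatewayId", "Value": gateway_id},
            {"Name": "GatewayName", "Value": gateway_name},
        ],
        "StartTime": end - timedelta(hours=hours),
        "EndTime": end,
        "Period": 300,           # 5-minute datapoints
        "Statistics": ["Sum"],   # total bytes per period
    }


params = transfer_metric_request("sgw-EXAMPLE", "example-gateway")
print(params["Namespace"], params["MetricName"])

# With boto3 installed and credentials configured, the request could be sent as:
#   import boto3
#   response = boto3.client("cloudwatch").get_metric_statistics(**params)
```

Swapping `metric_name` for `CloudBytesDownloaded`, `ReadBytes`, or `WriteBytes` gives the other transfer metrics listed above.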
Monitoring and security extended to on-premises gateways
- CloudTrail: logging of API calls
- IAM: integration for file gateway roles