DIY Hosting for Online Privacy Shoumik Palkar and Matei Zaharia Stanford University Appeared at HotNets 2017
Before: A Federated Internet The Internet and its protocols were designed to be federated. Organizations would host their own email, chat, and file transfer servers and manage their own data!
Today: The Era of Centralized Services Centralized services store data for organizations. Organizations trade control of their data for high availability at low cost. Highly Available Centralized Service (e.g., Gmail, Slack, Office 365)
Why Do We Use Centralized Services? They provide high availability at low cost: + Failover Configuration + Geo-replication + Auto-scaling + etc. Strawman: Hosting your own tiny EC2 VM costs $4.50/month. High availability costs even more.
A New Hope: Serverless Computing Serverless computing: the availability of a top-tier cloud provider, but zero cost when idle. Functions run only when a request is made and are billed at 100 ms granularity. [Figure: monthly cost ($0 to $6) versus monthly requests (0 to 3,000,000) for Lambda and EC2. Annotation: "Most users are here. What does this mean?"]
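A rough sketch of the comparison on this slide, assuming list prices of about $0.20 per million requests and $0.00001667 per GB-second for the serverless platform, no free tier, and the $4.50/month VM figure from the previous slide. The prices and the function names are assumptions for illustration, not values taken from the talk.

```python
import math

# Assumed prices (not from the talk); check the provider's current price list.
PRICE_PER_REQUEST = 0.20 / 1_000_000   # USD per invocation
PRICE_PER_GB_SECOND = 0.00001667       # USD per GB-second of compute
VM_MONTHLY_COST = 4.50                 # USD, the tiny-VM figure quoted in the talk
BILLING_GRANULARITY_MS = 100           # run times are rounded up to this step


def billed_ms(duration_ms: float) -> int:
    """Round a function's run time up to the next billing increment."""
    return math.ceil(duration_ms / BILLING_GRANULARITY_MS) * BILLING_GRANULARITY_MS


def serverless_monthly_cost(requests: int, duration_ms: float, memory_mb: int) -> float:
    """Pay-per-use cost: zero requests means zero cost."""
    gb_seconds = requests * (billed_ms(duration_ms) / 1000) * (memory_mb / 1024)
    return requests * PRICE_PER_REQUEST + gb_seconds * PRICE_PER_GB_SECOND


def break_even_requests(duration_ms: float, memory_mb: int) -> int:
    """Monthly request count above which the always-on VM becomes cheaper."""
    per_request = serverless_monthly_cost(1, duration_ms, memory_mb)
    return math.floor(VM_MONTHLY_COST / per_request)
```

Calling `break_even_requests(200, 128)` gives the request volume at which the two lines in the figure would cross under these assumed prices; a user below that volume pays less on the serverless platform.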
Deploy It Yourself: Taking Back the Internet Users run personal web applications using serverless computing platforms. High availability, low cost, and privacy for the first time.
Deploy It Yourself (DIY) Architecture Serverless Platform Key Service Email Key Load Balancer Storage Service Encrypted user data
Deploy It Yourself (DIY) Architecture: Setup Steps 1. Register a serverless function. 2. Configure a cloud storage provider. 3. Register a key with a key service.
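The architecture can be sketched as a single request handler that fetches the user's key, decrypts the stored state, updates it, and writes it back encrypted. Everything here is an illustrative assumption: `KeyService` and `BlobStore` are hypothetical in-memory stand-ins for the cloud services, and `toy_cipher` is a placeholder that is not secure; a real deployment would use an authenticated cipher from a vetted library.

```python
import hashlib
import json


class KeyService:
    """Hypothetical stand-in for a cloud key management service."""

    def __init__(self):
        self._keys = {}

    def register(self, key_id: str, key: bytes) -> None:
        self._keys[key_id] = key

    def fetch(self, key_id: str) -> bytes:
        return self._keys[key_id]


class BlobStore:
    """Hypothetical stand-in for an untrusted cloud storage service."""

    def __init__(self):
        self.objects = {}


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a SHA-256 counter keystream. Placeholder only: NOT secure."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    # XOR is its own inverse, so the same call encrypts and decrypts.
    return bytes(a ^ b for a, b in zip(data, stream))


def handle_request(event: dict, keys: KeyService, store: BlobStore) -> dict:
    """Function body: fetch key, decrypt state, update it, re-encrypt, store."""
    key = keys.fetch(event["key_id"])
    blob = store.objects.get(event["user"])
    state = json.loads(toy_cipher(key, blob)) if blob else {"inbox": []}
    state["inbox"].append(event["message"])
    # Only ciphertext ever reaches the storage service.
    store.objects[event["user"]] = toy_cipher(key, json.dumps(state).encode())
    return {"status": 200, "inbox_size": len(state["inbox"])}
```

The point of the sketch is the boundary: plaintext exists only inside `handle_request`, while the storage service holds encrypted bytes.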
Why is DIY More Secure*?
1. Narrow boundary between data and service. vs. centralized service: many internal systems can access user data.
2. Stored data is encrypted to prevent leaks. vs. centralized service: employees access data to monetize it.
3. Cloud providers minimize data access internally. vs. centralized service: EULAs state data can be used for ad targeting, etc.
4. Ability to migrate data off insecure clouds and regions. vs. centralized service: generally, no control over where data lives.
*Assumes the function code, isolation mechanisms, and key service are trusted.
Threat Model
Trusted:
1. Serverless Computing Platform Isolation: function containers must hide execution and function state. (Could one day be attested and secured using hardware enclaves?)
2. Key Management Service: protects access to users' keys. (Management services are already secured via enclaves today and have strict EULAs.)
3. Function Code: must not leak data or have critical bugs.
Untrusted:
1. Internal network
2. Storage service and other cloud services
3. Internet traffic between user and cloud provider
DIY Architecture Serverless Platform Key Service Email Key Load Balancer Storage Service Encrypted user data Trusted Components: simple enough to be secured via hardware enclaves
What DIY Protects Against Snooping employees :) Data mining and sale :) Buggy or insecure software :| Government surveillance :(
Rest of this Talk 1. Back-of-the-Envelope Costs 2. Chat Prototype and Challenges 3. A Marketplace for DIY
Back-of-the-Envelope Costs

Application     Daily Requests   Compute / Request   Memory    Persistent Storage   Monthly Cost
Group Chat      2000             500 ms              128 MB    2 GB                 $0.14
Email           500              500 ms              128 MB    5 GB                 $0.21
File Transfer   100              2000 ms             1 GB      2 GB                 $0.14
IoT Control     100              500 ms              128 MB    1 GB                 $0.12
Video Chat*     1                15 min call         1.7 GB    1 GB                 $0.84

Comparison: un-replicated EC2 t2.nano server (500 MB, CPU burst only) = $4.50/month
*On a billed-per-second VM.
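The workload columns of the table can be turned into the two quantities a serverless platform bills on, requests and GB-seconds of compute. This sketch computes only those intermediate quantities; it assumes a 30-day month and 1 GB = 1024 MB, and it does not try to reproduce the dollar figures, which also depend on prices and storage charges that the slide does not break down.

```python
DAYS_PER_MONTH = 30  # assumption; the slide does not state the month length

# (application, daily requests, compute per request in ms, memory in MB)
WORKLOADS = [
    ("Group Chat", 2000, 500, 128),
    ("Email", 500, 500, 128),
    ("File Transfer", 100, 2000, 1024),
    ("IoT Control", 100, 500, 128),
]


def monthly_usage(daily_requests: int, compute_ms: int, memory_mb: int):
    """Return (requests per month, GB-seconds per month) for one workload."""
    requests = daily_requests * DAYS_PER_MONTH
    gb_seconds = requests * (compute_ms / 1000) * (memory_mb / 1024)
    return requests, gb_seconds


for name, daily, ms, mb in WORKLOADS:
    requests, gb_seconds = monthly_usage(daily, ms, mb)
    print(f"{name:14s} {requests:7d} requests  {gb_seconds:8.1f} GB-s")
```

Under these assumptions the group chat row works out to 60,000 requests and 3,750 GB-seconds per month. The video chat row is left out because the slide notes it runs on a per-second-billed VM rather than as a function.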
Chat Prototype and Challenges [Diagram: client, HTTPS endpoint, SQS, encrypted storage] Challenge 1: Asynchronous communication (reading messages without keeping Lambda running). SQS is used to allow client polling without running the Lambda function continuously. Challenge 2: Latency with pay-per-request storage. Append small objects to S3.
Chat Prototype and Challenges: Results 200 ms response time (most time is spent reading from the SQS queue and posting to S3). 25,000 messages/month at no cost, including SQS and Lambda compute, plus an additional $0.09/month for storage.
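One reading of the two design choices in the chat prototype, sketched with hypothetical in-memory stand-ins (`QueueService` for a hosted queue such as SQS, `ObjectStore` for pay-per-request storage such as S3). The class and function names are assumptions, and encryption of the stored objects is omitted here to keep the sketch short.

```python
from collections import defaultdict, deque


class QueueService:
    """Hypothetical in-memory stand-in for a hosted queue such as SQS."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def send(self, queue: str, body: str) -> None:
        self._queues[queue].append(body)

    def receive(self, queue: str, max_messages: int = 10) -> list:
        q = self._queues[queue]
        return [q.popleft() for _ in range(min(max_messages, len(q)))]


class ObjectStore:
    """Hypothetical stand-in for pay-per-request object storage such as S3."""

    def __init__(self):
        self.objects = {}
        self.put_count = 0

    def put(self, key: str, body: bytes) -> None:
        self.objects[key] = body
        self.put_count += 1


def on_send(sender, recipient, text, queues, store, seq):
    """Runs once per HTTPS request, then exits; nothing stays resident."""
    # Each message becomes its own small object, so no read-modify-write
    # of a large chat log is needed (encryption omitted in this sketch).
    store.put(f"chat/{recipient}/{seq:08d}", f"{sender}: {text}".encode())
    queues.send(recipient, f"{sender}: {text}")


def client_poll(recipient, queues):
    """The client reads its queue directly; no function is invoked."""
    return queues.receive(recipient)
```

Because the recipient polls the queue rather than the function, compute is billed only for the short `on_send` invocations.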
Bringing DIY Applications to Everyone Available on the DIY App Store. Cloud provider manages: installation, permissions/signing, updates, etc. For Users: privacy with automatic low cost and availability. For Developers: faster innovation, with no need to manage a full multitenant scalable service.
Conclusion DIY could revolutionize how we run web applications by offering privacy, high availability, and low cost for the first time. https://www.shoumik.xyz @sppalkia sppalkia shoumik@cs.stanford.edu
Related Work
E2E encrypted apps (e.g., Signal, WhatsApp): don't support server-side computation.
P2P social networks (e.g., Diaspora): could be hosted on top of serverless platforms?
No-trust cryptographic protocols (e.g., Dissent, Pung): stronger security guarantees, but harder to deploy.