Ascend: Architecture for Secure Computation on Encrypted Data Oblivious RAM (ORAM)


CSE 5095 & ECE 4451 & ECE 5451, Spring 2017, Lecture 7b
Ascend: Architecture for Secure Computation on Encrypted Data; Oblivious RAM (ORAM)
Marten van Dijk, Syed Kamran Haider, Chenglu Jin, Phuong Ha Nguyen
Department of Electrical & Computer Engineering, University of Connecticut

Context: Cloud Computing
- The user sends sensitive data to the cloud, trusting it to do the computation correctly and not to expose the data to anyone.
- Examples: search engine, medical/patent database
- Cloud software is buggy and potentially malicious.
[Diagram: x, P go to the cloud; P(x) comes back.]

Threat Model
- The cloud itself is malicious.
- The server can expose x to the world.
- The server promises to run P on x but can (in addition) run any program of its choosing.
- The server can monitor the I/O channels of the processor.
[Diagram: x, P go to the cloud; P(x) comes back.]

Outline
- Motivation
- Possible Solutions
  - Fully Homomorphic Encryption
  - Tamper Resistant Hardware
- Ascend Processor
- Streaming applications

How to meet the security requirement?
- Cryptographic solution: Fully Homomorphic Encryption (FHE)
- Hardware solution: Tamper Resistant Hardware

FHE (Fully Homomorphic Encryption)
[Diagram: Enc(x), P go to the server; Enc(P(x)) comes back.]
Limitations of FHE:
- The most efficient FHE scheme so far incurs a 10^9x performance overhead for straight-line code.
- Control flow in the program (branches, loops, etc.) may incur orders of magnitude of additional overhead.
- FHE is far from being practical.

Tamper Resistant Hardware
- The secure processor is trusted and shares a secret key with the client.
- Private information stored in the hardware is not accessible through external means.
- Examples: IBM 4758, XOM, Aegis, NGSCB, TrustZone, TPM, TPM + SVM, TPM + TXT, SecureBlue++, SGX

Tamper Resistant Hardware: Limitations
- Just trusting the tamper resistance of the chip is not enough!
- The I/O channels between the secure processor and main memory can be monitored by software and leak information.
- Examples: address channel, I/O timing channel

Leakage through Address Channel
    for i = 1 to N
        if (x == 0) sum += A[i]
        else sum += A[0]
- Address sequence when x == 0: 0x00, 0x01, 0x02
- Address sequence otherwise: 0x00, 0x00, 0x00
- The value of x is leaked through the access pattern.
- Sensitive data might be exposed by observing the addresses [HIDE, NDSS12].
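The leak on this slide can be reproduced with a short sketch. The function name `address_trace` and the 0-based indexing are assumptions made for illustration; the slide's pseudocode is 1-based, but its printed address sequence starts at 0x00.

```python
def address_trace(x, n):
    """Return the array indices touched by the slide's loop for a secret x."""
    trace = []
    for i in range(n):
        # Secret-dependent branch: which index is read depends on x.
        index = i if x == 0 else 0
        trace.append(index)
    return trace


# An observer of the address bus sees one of two clearly different patterns.
print([hex(a) for a in address_trace(0, 3)])  # ['0x0', '0x1', '0x2']
print([hex(a) for a in address_trace(1, 3)])  # ['0x0', '0x0', '0x0']
```

The data values never appear in the trace, yet the two traces differ, so the addresses alone are enough to tell whether x is zero.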

Leakage through Timing Channel
    for i = 1 to N
        if (x == 0) do nothing
        else access memory
- x == 0: no memory access. Otherwise: memory access.
- The value of x is leaked by observing whether the memory access happens or not.
- Malicious software on a secure processor can easily leak sensitive data through the chip pins.

Outline
- Motivation
- Possible Solutions
- Ascend Processor
  - Two-interactive protocol
  - Path Oblivious RAM
  - Periodic Access
  - Path ORAM integrity
- Streaming Applications

Ascend Security Goal
- An adversary cannot learn a user's private information by observing the pin traffic of Ascend (between the chip and main memory).
- This implies that both the address channel and the timing channel must be protected.

Ascend: Custom Processor for Secure Computation on Encrypted Data
Provides Isolated Computation based on Trusted HW for Certified Program Execution.
Adversarial Model:
- Adversary with full physical access to the data bus
- HW TCB = CPU chip (with caches, mem. interface), Package
- SW TCB = Application processes, Trusted OS
- Not trusted: External storage, in particular, DRAM
Leakage at the LLC-DRAM boundary, with countermeasure:
- Data blocks: AES
- Timing channel of reads/writes: Periodic access
- Address access pattern of reads/writes: Path ORAM
Leakage at Input/Output, with countermeasure:
- Private user input: PKI
- Streaming in server data: Fixed I/O scheduler / AES
- Termination channel T leaks bits
Authenticity/Freshness of DRAM: Merkle Tree

Sanctum: Commodity Architecture for Secure Computation
[Figures: Sanctum RISCV prototype; Sanctum Software Stack]
Adversarial Model:
- Adversary can only launch remote SW attacks
- HW TCB = CPU chip (with caches, mem. interface), Package
- Access to DRAM by peripherals is controlled by a trusted MCU (as in Intel SGX)
- Sanctum forbids access by untrusted OS to reserved DRAM for Secure Enclaves
- SW TCB = App. Mod. (in Secure Enclave @ lowest priv. level), Sec. Monitor (@ highest priv. level)
- Allows multiple modules (Sanctum prevents cache timing channel attacks using locality preserving cache-coloring)
- Allows untrusted OS
Sanctum:
- Small set of changes to an existing architecture; all doable in FPGA
- Modified commodity software stack (hypervisor friendly)
- Permits security monitor SW to be replaced
- Application on a standard OS can create coexisting Secure Enclaves (à la SGX) for Isolated Computation with Remote Attestation
- No Data Encryption, Memory Integrity Checking, or ORAM needed
- Long lived enclaves with demand paging require page-level ORAM

Oblivious RAM (ORAM)
- Client: limited storage, trusted
- Server: ample storage, untrusted
- Correctness: a valid RAM
- Request sequence A: Read(a1), Write(a2, d), ...
- Obfuscated sequence ORAM(A) is what the server sees
- Security: for any A and A', if |A| = |A'|, then ORAM(A) ~ ORAM(A')

Oblivious RAM (ORAM) [1]
- ORAM allows a client to conceal its access pattern to remote storage by continuously shuffling and re-encrypting data as they are accessed.
- Any two access sequences of the same length are computationally indistinguishable.
- ORAM does not protect the timing channel: when accesses are made can still leak information.
[1] O. Goldreich and R. Ostrovsky. Software protection and simulation on oblivious RAMs. J. ACM, 1996.

Oblivious RAM (ORAM)
[Figure: the ORAM algorithm sits between a trusted client and untrusted storage. Settings shown: client with untrusted storage; compute node with an ORAM interface and an untrusted disk; secure processor with an ORAM controller and untrusted external DRAM.]
- Correctness: a valid RAM
- Request sequence A: Read(a1), Write(a2, d), ...
- Obfuscated sequence ORAM(A) is what the server sees
- Security: for any A and A', if |A| = |A'|, then ORAM(A) ~ ORAM(A')

Naïve Oblivious RAM
- Each access touches all N data blocks in main memory.
- Blocks are read, re-encrypted using probabilistic encryption, and written back.
- Dummy blocks are filled in to obfuscate the memory footprint.
- The O(N) overhead is unacceptable.
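A minimal sketch of the linear-scan idea follows. The class `NaiveORAM`, the 16-byte block size, and the hash-based toy cipher are all assumptions for illustration; the toy cipher only stands in for a real probabilistic encryption scheme and is not secure. Dummy-block padding from the slide is omitted.

```python
import hashlib
import os

BLOCK_SIZE = 16  # bytes per block (illustrative assumption)


def _keystream(key, nonce):
    return hashlib.sha256(key + nonce).digest()[:BLOCK_SIZE]


def encrypt(key, plaintext):
    """Toy probabilistic encryption: fresh nonce per call. NOT secure."""
    nonce = os.urandom(16)
    body = bytes(p ^ k for p, k in zip(plaintext, _keystream(key, nonce)))
    return nonce + body


def decrypt(key, ciphertext):
    nonce, body = ciphertext[:16], ciphertext[16:]
    return bytes(c ^ k for c, k in zip(body, _keystream(key, nonce)))


class NaiveORAM:
    def __init__(self, n):
        self.key = os.urandom(16)
        self.server = [encrypt(self.key, bytes(BLOCK_SIZE)) for _ in range(n)]
        self.touched = []  # indices the untrusted server sees being accessed

    def access(self, op, addr, data=None):
        """Read or write block addr; data must be BLOCK_SIZE bytes on a write."""
        result = None
        for i in range(len(self.server)):      # touch every block, always
            self.touched.append(i)
            block = decrypt(self.key, self.server[i])
            if i == addr:
                result = block
                if op == "write":
                    block = data
            # Re-encrypt with a fresh nonce so old and new ciphertexts differ.
            self.server[i] = encrypt(self.key, block)
        return result
```

Every call reads and rewrites all N ciphertexts, so the server's view is the same for any address and for reads and writes alike, at the O(N) cost the slide calls unacceptable.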

Path ORAM
- Efficient and simple
- External DRAM is structured as a binary tree with L levels
[Figure: trusted coprocessor with ORAM interface (on-chip); Path ORAM binary tree with leaves 0 to 3 (off-chip).]

Path ORAM
- Position Map: maps each block to a random path
- Stash: temporarily holds some blocks
- Invariant: if a block is mapped to a path, it must be on that path or in the stash
[Figure: trusted coprocessor with ORAM interface, stash, and position map (on-chip); Path ORAM binary tree with L levels and leaves 0 to 3 (off-chip).]

Path ORAM Example: Load Block A
1) Look up the position map: s = PosMap(A)
2) Load the path into the stash
3) Return data block A to the processor
4) Remap A to a random leaf: PosMap(A) = rand()
5) Write back the path
[Figure: trusted coprocessor (on-chip) with ORAM interface, stash, and position map; block A lies on the path to leaf 0 of the off-chip Path ORAM tree with leaves 0 to 3.]
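The five steps can be sketched in a few dozen lines. This is an unencrypted toy under stated assumptions, not the controller described in the lecture: the class name `PathORAM`, the heap-style node numbering, the default bucket size of 4, the greedy leaf-to-root write-back, and the unbounded stash are all illustrative choices.

```python
import random


class PathORAM:
    def __init__(self, levels, bucket_size=4, seed=None):
        self.Z = bucket_size
        self.num_leaves = 1 << levels
        self.rng = random.Random(seed)
        # Heap numbering: node 1 is the root, children of n are 2n and 2n+1.
        self.tree = {node: [] for node in range(1, 2 * self.num_leaves)}
        self.posmap = {}           # address -> leaf
        self.stash = {}            # address -> data
        self.observed_leaves = []  # what the untrusted memory sees

    def _path(self, leaf):
        node, path = self.num_leaves + leaf, []
        while node >= 1:
            path.append(node)
            node //= 2
        return path  # ordered leaf to root

    def access(self, op, addr, data=None):
        # 1) Look up the position map (a random path if addr is unknown).
        leaf = self.posmap.get(addr, self.rng.randrange(self.num_leaves))
        self.observed_leaves.append(leaf)
        path = self._path(leaf)
        # 2) Load the whole path into the stash.
        for node in path:
            for a, d in self.tree[node]:
                self.stash[a] = d
            self.tree[node] = []
        # 3) Return (and possibly update) the requested block.
        result = self.stash.get(addr)
        if op == "write":
            self.stash[addr] = data
        # 4) Remap the block to a fresh random leaf.
        if addr in self.stash:
            self.posmap[addr] = self.rng.randrange(self.num_leaves)
        # 5) Write the path back, placing blocks as deep as the invariant allows.
        for node in path:
            fits = [a for a in self.stash
                    if node in self._path(self.posmap[a])][: self.Z]
            self.tree[node] = [(a, self.stash.pop(a)) for a in fits]
        return result
```

A call such as `PathORAM(levels=3, seed=1).access("write", 5, "hello")` exposes only a leaf number to the memory. Because the block is remapped in step 4 before it is written back, the next access to the same address goes to a fresh random path.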


Path Oblivious RAM (Path ORAM*)
- Each access touches only O(log N) data blocks.
- The most practical ORAM scheme known to date.
- For block sizes bigger than ω(log^2 N), Path ORAM is asymptotically better than the best known ORAM scheme with small client storage.
* Emil Stefanov, Marten van Dijk, Elaine Shi, Christopher Fletcher, Ling Ren, Xiangyao Yu, Srinivas Devadas. Path ORAM: An Extremely Simple Oblivious RAM Protocol. CCS 2013.
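To give the O(log N) claim a number, the sketch below counts blocks moved per access. It assumes a tree with one leaf per data block and Z = 4 blocks per bucket; both are illustrative parameters, not values stated on the slide, and the function names are hypothetical.

```python
import math


def path_oram_blocks_moved(n_blocks, bucket_size=4):
    """Blocks read plus written per access: one full path each way."""
    leaf_level = math.ceil(math.log2(n_blocks))   # L, with the root at level 0
    path_blocks = bucket_size * (leaf_level + 1)  # Z blocks in each of L+1 buckets
    return 2 * path_blocks


def naive_oram_blocks_moved(n_blocks):
    """The naive scheme reads and rewrites every block."""
    return 2 * n_blocks


n = 2 ** 20
print(path_oram_blocks_moved(n))   # 168
print(naive_oram_blocks_moved(n))  # 2097152
```

Under these assumptions, a one-million-block memory moves 168 blocks per access instead of about two million.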

Optimizing ORAM
- Recursive ORAM is needed for storing the Position Map.
- The Position map Lookup Buffer (PLB) caches recent position map blocks.
- Unified ORAM uses one tree to store the Recursive ORAM.
References:
- C. W. Fletcher, L. Ren, A. Kwon, M. van Dijk, and S. Devadas, Freecursive ORAM: [Nearly] Free Recursion and Integrity Verification for Position-based Oblivious RAM, ASPLOS 2015
- C. W. Fletcher, L. Ren, A. Kwon, M. van Dijk, E. Stefanov, D. Serpanos, and S. Devadas, A Low-Latency, Low-Area Hardware Oblivious RAM Controller, FCCM 2015
- Prefetcher: X. Yu, S. K. Haider, L. Ren, C. W. Fletcher, A. Kwon, M. van Dijk, and S. Devadas, PrORAM: Dynamic Prefetcher for Oblivious RAM, ISCA 2015

Ring ORAM
Make evictions more efficient:
- Reverse lexicographic order: better load balancing allows 1 eviction per A accesses
Read only 1 block per node:
- Add Y reserved dummy slots, and permute the blocks in nodes
- Read the block of interest or a fresh dummy
- Re-permute if out of fresh dummies
Best bandwidth (without server computation)
Ring ORAM: L. Ren, C. W. Fletcher, A. Kwon, E. Stefanov, E. Shi, M. van Dijk, and S. Devadas, Constants Count: Practical Improvements to Oblivious RAM, USENIX Security 2015
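Reverse lexicographic eviction order can be sketched as a bit reversal of a running eviction counter. The helper names and the leaf-labelling convention (most significant bit chooses the root's child) are assumptions for illustration.

```python
def reverse_bits(value, width):
    """Reverse the low `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def eviction_leaf(counter, levels):
    """Leaf whose path is evicted on the counter-th eviction."""
    return reverse_bits(counter % (1 << levels), levels)


def is_eviction_due(access_count, a):
    """One eviction after every A accesses (A is the slide's parameter)."""
    return access_count > 0 and access_count % a == 0


print([eviction_leaf(g, 3) for g in range(8)])  # [0, 4, 2, 6, 1, 5, 3, 7]
```

Consecutive evictions alternate between the left and right halves of the tree (0, 4, 2, 6, ...), which spreads eviction work evenly over the buckets.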

Other Adversarial Models
- Write-only ORAM: a new Flat ORAM (under submission), with optimizations from previous slides, gives a 1.5-3X performance penalty.
- A weaker adversarial model, remote SW attacks, is realistic for datacenter infrastructures: Sanctum.
  - No ORAM needed
  - Demand paging for long running enclaves needs page-level ORAM (in enclave SW)
- If the adversary is able to physically tap the memory bus (as in Ascend), then we need the full cacheline-level ORAM (as in Ascend), with an associated 3-10X performance penalty.

Flat ORAM
[Figure: Trusted Domain (on chip) with program, stash, ORAM controller, and position map; Untrusted Domain (external DRAM) holding a tree with leaves 0 to 7. Numbered steps for program address a: 1 Get Leaf (s = 5), 2 Read Path, 3 Return Data, 4 Remap, 5 Write Path.]

Flat ORAM
[Figure: Trusted Domain (on chip) with program, stash, ORAM controller, and position map; Untrusted Domain (external DRAM) holding a flat array with indices 0 to 7. Numbered steps for program address a: 1 Get Index (s = 5), 2 Read Block, 3 Return Data, 4 Remap, 5 Write Block.]

Flat ORAM
[Figure: layout in DRAM, with an on-chip map on the processor chip. Hierarchy 0: D0 to D7. Hierarchy 1: P0 to P5. Hierarchy 2: P6 to P8. Also shown: O0 to O3. Legend: D = Data Blocks, P = PosMap Blocks, O = OccMap Blocks.]

Onion ORAM: Oblivious Access to Cloud Storage
Onion ORAM assumes server computation:
- Achieves O(1) bandwidth: reads one block per path
- Uses PIR to execute both block requests and block evictions
PIR based on Additively Homomorphic Encryption:
- Selects $Y_i \in \{Y_1, Y_2, \ldots, Y_m\}$ without revealing $i$
- User sends $E(x_1), E(x_2), \ldots, E(x_m)$, where $x_i = 1$ and all other $x_j = 0$
- Server evaluates $Y_1 \otimes E(x_1) \oplus Y_2 \otimes E(x_2) \oplus \cdots \oplus Y_m \otimes E(x_m) = E(Y_1 x_1 + Y_2 x_2 + \cdots + Y_m x_m) = E(Y_i)$, where $\oplus$ denotes homomorphic addition and $\otimes$ multiplication of a ciphertext by a plaintext scalar
Adds a layer of encryption during eviction (like onion rings):
- Each block moves at least one level down
- Leaf blocks are refreshed by the user
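The PIR selection step can be tried out with any additively homomorphic scheme. The sketch below swaps in textbook Paillier with tiny hard-coded primes purely for illustration. It is not the scheme or parameters used by Onion ORAM, it is not secure, and the function names are hypothetical. It needs Python 3.8+ for `pow(x, -1, n)`.

```python
import math
import random


def keygen(p=1009, q=1013):
    """Toy Paillier key with g = n + 1. Primes are far too small for real use."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)
    return n, (lam, mu)


def encrypt(n, m, rng):
    n2 = n * n
    r = rng.randrange(1, n)
    while math.gcd(r, n) != 1:
        r = rng.randrange(1, n)
    return (1 + m * n) * pow(r, n, n2) % n2


def decrypt(n, priv, c):
    lam, mu = priv
    u = pow(c, lam, n * n)
    return (u - 1) // n * mu % n


def pir_query(n, index, m, rng):
    """User side: E(x_1..x_m) with x_index = 1 and every other x_j = 0."""
    return [encrypt(n, 1 if j == index else 0, rng) for j in range(m)]


def pir_answer(n, blocks, query):
    """Server side: the product of E(x_j)^Y_j equals E(sum of Y_j * x_j)."""
    n2 = n * n
    acc = 1  # a trivial encryption of 0
    for y, c in zip(blocks, query):
        acc = acc * pow(c, y, n2) % n2
    return acc


rng = random.Random(0)  # not a cryptographic RNG; illustration only
n, priv = keygen()
blocks = [11, 22, 33, 44]  # the server's Y_1..Y_m, each smaller than n
answer = pir_answer(n, blocks, pir_query(n, 2, len(blocks), rng))
print(decrypt(n, priv, answer))  # 33
```

The server touches every block and sees only ciphertexts, so it learns nothing about which index was selected. In Paillier, homomorphic addition is ciphertext multiplication and scalar multiplication is exponentiation, which is what `pir_answer` computes.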

Onion ORAM: Oblivious Access to Cloud Storage
- O(1) bandwidth blowup
- Moderate server computation
- Big block size
- Secure and correct in the malicious setting
S. Devadas, M. van Dijk, C. W. Fletcher, L. Ren, E. Shi, and D. Wichs, Onion ORAM: A Constant Bandwidth Blowup Oblivious RAM, TCC 2016

Timing Channel Protection
- Access ORAM at a fixed rate.
- Issue a dummy access when an ORAM access should be made but there is no real request.
- Notation: ORAM interval (Oint). Make an ORAM access Oint cycles after the previous one returns.
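A small simulation makes the fixed-rate rule concrete. The function `periodic_schedule`, the constant ORAM latency, and the cycle counts are assumptions for illustration; a real controller works in hardware with its own latency model.

```python
from collections import deque


def periodic_schedule(request_times, o_int, oram_latency, total_cycles):
    """Return (start_cycle, kind) for every ORAM access up to total_cycles.

    An access starts o_int cycles after the previous one returns. If no real
    request is waiting at that moment, a dummy access is issued instead.
    """
    pending = deque(sorted(request_times))
    schedule = []
    t = o_int
    while t < total_cycles:
        if pending and pending[0] <= t:
            pending.popleft()
            schedule.append((t, "real"))
        else:
            schedule.append((t, "dummy"))
        t += oram_latency + o_int  # the access returns, then wait Oint cycles
    return schedule


busy = periodic_schedule([5, 120], o_int=10, oram_latency=40, total_cycles=200)
idle = periodic_schedule([], o_int=10, oram_latency=40, total_cycles=200)
print(busy)  # [(10, 'real'), (60, 'dummy'), (110, 'dummy'), (160, 'real')]
print([t for t, _ in busy] == [t for t, _ in idle])  # True
```

The start times are identical whether or not the program issues requests. An observer of the pins cannot tell real accesses from dummies by timing, at the cost of extra latency for real requests and wasted bandwidth on dummies.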

Data Integrity
- So far, we have assumed that the server just watches traffic and runs arbitrary software.
- What if the server modifies the user's data in the ORAM? Data integrity in ORAM is required.*
- Data integrity refers to maintaining and assuring the accuracy and consistency of data over its entire life-cycle.
- 17% latency overhead on top of recursive Path ORAM
* Ling Ren, Christopher W. Fletcher, Xiangyao Yu, Marten van Dijk, and Srinivas Devadas. Integrity Verification for Path Oblivious-RAM. HPEC 2013.
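The generic idea behind hash-tree integrity checking can be sketched as follows. This is a plain Merkle tree over a list of blocks, not the ORAM-specific construction of the cited HPEC 2013 paper. The function names, the use of SHA-256, and the duplicate-last-node padding are assumptions for illustration.

```python
import hashlib


def h(data):
    return hashlib.sha256(data).digest()


def _parents(level):
    return [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]


def merkle_root(blocks):
    """Root hash; only this value must be kept in trusted on-chip storage."""
    level = [h(b) for b in blocks]
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]  # pad odd levels by duplicating the last hash
        level = _parents(level)
    return level[0]


def merkle_proof(blocks, index):
    """Sibling hashes from leaf to root for the block at `index`."""
    level = [h(b) for b in blocks]
    proof = []
    while len(level) > 1:
        if len(level) % 2:
            level = level + [level[-1]]
        sibling = index ^ 1
        proof.append((level[sibling], sibling < index))
        level = _parents(level)
        index //= 2
    return proof


def verify(block, proof, root):
    """Recompute the root from an untrusted block and its proof."""
    node = h(block)
    for sibling, sibling_is_left in proof:
        node = h(sibling + node) if sibling_is_left else h(node + sibling)
    return node == root


blocks = [b"blk0", b"blk1", b"blk2", b"blk3"]
root = merkle_root(blocks)
proof = merkle_proof(blocks, 2)
print(verify(b"blk2", proof, root))      # True
print(verify(b"tampered", proof, root))  # False
```

Only the root hash needs trusted storage. Any modification or replay of a block in untrusted memory changes the recomputed root and is detected.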

Outline
- Motivation
- Challenges and Solutions
- Ascend Processor
- Stream applications
  - Limitation of Ascend in Batch mode
  - Secure Interaction with Ascend

Limitation of Ascend
- Batch computation model: the channel between the user and Ascend is used only at the beginning and end of the computation.
- No interaction during computation: user input/output, network, disk, etc.
[Figure: Ascend with Oblivious RAM, network, and disk.]

Secure Interaction with Ascend
Intuition:
- As long as the I/O of Ascend does not depend on private data, no sensitive information is leaked.
- The server does not know any private data.
- If the server completely controls Ascend's I/O behavior, no sensitive information is leaked.

Streaming Applications
- Document Matching
- DNA Sequence Matching
- Content-Based Image Retrieval (CBIR)
In all these applications, the client has an encrypted document/gene sequence/image that has to be matched privately against a large public data set that is streamed in.

Hardware Platform Evaluation
- Oracle-Ascend: no overhead in the input thread
- Oracle-Baseline: Oracle-Ascend with DRAM as the main memory
- Stream-Ascend: choose the streaming interval carefully
- Small performance overhead because data records fit in cache, and ORAM is rarely accessed.

Applications                     Oracle-Ascend   Stream-Ascend
Document Matching                < 0.1%          24.5%
DNA Sequence Matching            < 0.1%          0.7%
Content-Based Image Retrieval    2.6%            3.9%

The Full Picture
[Figure: end-to-end data flow between the user, the untrusted server, and Ascend.
- Time = 0: public input (from the server, disk) and Enc(private input) (from the user).
- 0 < Time < T: network package, user input, etc., fed in by the server through an input scheduler (public FSM); results leave through an output scheduler (public FSM). Public information: package size, arrival time, etc.
- Inside Ascend: input thread, output thread, and application thread (main program); FEF, BEF, SIB, SUB; AES; ORAM interface with on-chip position map and stash; periodic access to the data Path ORAM and position map ORAMs; hash tree (Hash 0 to Hash 5) with the top hash kept in Ascend.
- Time = T: Enc(final result) returned to the user.]