Who am I? I'm a Python developer who has been working on OpenStack since 2011. I currently work for Aptira, who do OpenStack, SDN, and orchestration consulting. I'm here today to help you learn from my fail.

OpenStack is an orchestration system for setting up virtual machines and associated other virtual resources such as networks and storage on clusters of computers. At a high level, OpenStack is just configuring existing facilities of the host operating system -- there isn't really a lot of difference between OpenStack and a room full of system admins frantically resolving tickets requesting virtual machines be set up. The only real difference is scale and predictability. To do its job, OpenStack needs to be able to manipulate parts of the operating system which are normally reserved for administrative users. This talk is the story of how OpenStack has done that over time, what we learnt along the way, and what I'd do differently if I had my time again. Lots of systems need to do these things, so even if you never use OpenStack hopefully there are things to be learnt here.

That said, someone I respect suggested last weekend that good conference talks are actionable. A talk full of OpenStack war stories isn't actionable, so I've spent the last week rewriting this talk to hopefully be more of a call to action than just an interesting story. I apologise for any resulting mismatch between the original proposal and what I present here.

Back to the task in hand though -- providing control of virtual resources to untrusted users. OpenStack has gone through several iterations of how it thinks this should be done, so perhaps it's illustrative to start by asking how other similar systems achieve this. There are lots of systems that have a requirement to configure privileged parts of the host operating system. The most obvious example I can think of is Docker. How does Docker do this? Well... it's actually not all that pretty. Docker presents its API over a Unix domain socket by default in order to limit control to local users (you can of course configure this). So to provide access to Docker, you add users to the docker group, which owns that domain socket. The Docker documentation warns that "the docker group grants privileges equivalent to the root user". So that went well. Docker is really an example of the simplest way of solving this problem -- by not solving it at all. That works well enough for systems where you can tightly control the users who need access to those privileged operations -- in Docker's case by making them have an account in the right group on the system and logging in locally. However, OpenStack's whole point is to let untrusted remote users create virtual machines, so we're going to have to do better than that.

The next level up is to do something with sudo. The way we all use sudo day to day, you allow users in the sudo group to become root and execute any old command, with a configuration entry that probably looks a little like this. Now that config entry is basically line noise, but it says "allow members of the group called sudo, on any host, to run any command as root". You can of course embed this into your Python code using subprocess.call() or similar.
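The slide with the configuration entry is not part of this transcript. As a hedged reconstruction, a stock entry matching that description and a minimal Python wrapper around it might look like the sketch below. The helper names `build_sudo_command` and `run_as_root` are made up for illustration, and the `runner` parameter exists only so the sketch can be exercised without real root access.

```python
import subprocess

# A sudoers entry matching the description above (group "sudo", any host,
# any command, as any user including root) commonly reads:
#
#   %sudo   ALL=(ALL:ALL) ALL


def build_sudo_command(argv):
    """Prefix a command line with sudo. Hypothetical helper for illustration."""
    return ["sudo"] + list(argv)


def run_as_root(argv, runner=subprocess.call):
    """Run argv via sudo and return its exit code.

    `runner` defaults to subprocess.call; it is injectable so the sketch can
    be tried out without actually escalating privileges.
    """
    return runner(build_sudo_command(argv))
```

Calling `run_as_root(["ls", "/root"])` would hand `["sudo", "ls", "/root"]` to `subprocess.call()`, which is exactly the "execute any old command" arrangement described above.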

On the security front, it's possible to do a little bit better than a "nova can execute anything" entry. For example, this says that the sudo group on all hosts can execute /bin/ls with any arguments. OpenStack never actually specified the complete list of commands it executed. That was left as a job for packagers, which of course meant it wasn't done well.

So there's our first actionable thing -- if you assume that someone else (packagers, the ops team, whoever) is going to analyse your code well enough to solve the security problem that you can't be bothered solving, then you have a problem. Now, we weren't necessarily deliberately punting here. It's obvious to me how to grep the code for commands run as root to add them to a sudo configuration file, but that's unfair. I wrote some of this code, and I am much closer to it than a system admin who just wants to get the thing deployed.

We can of course do better than just raw sudo. Next we tried a thing called rootwrap, which was mostly an attempt to provide a better boundary around exactly what commands you can expect an OpenStack binary to execute. So for example, maybe it's OK for me to read the contents of a configuration file specific to a virtual machine I am managing, but I probably shouldn't be able to read /etc/shadow or whatever. We can do that by doing something like the example shown, where nova-rootwrap is a program which takes a configuration file and a command line to run. The contents of the configuration file are used to determine if the command line should be executed. Now we can limit the sudo configuration file to only needing to be able to execute nova-rootwrap. I thought about putting in a whole bunch of slides about exactly how to configure rootwrap, but then I realised that this talk is only 25 minutes and you can totally google that stuff.
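The core idea of such a wrapper can be sketched in a few lines of Python. This is a toy illustration, not the real rootwrap: the policy dictionary, the directory prefix, and the function name `is_allowed` are all assumptions, and the real tool reads its filters from configuration files in its own format.

```python
import os.path

# Hypothetical policy: executable name -> directory prefixes that its path
# arguments must live under. The prefix below is an assumed example.
ALLOWED = {
    "cat": ("/var/lib/nova/instances/",),
}


def is_allowed(argv, policy=ALLOWED):
    """Decide whether a command line may be run with escalated permissions."""
    if not argv:
        return False
    prefixes = policy.get(argv[0])
    if prefixes is None:
        return False  # executable is not in the policy at all
    for arg in argv[1:]:
        if arg.startswith("-"):
            return False  # this toy policy permits no options
        # Collapse "../" tricks before comparing. Symlinks are NOT resolved
        # here, which a real checker would also need to think about.
        path = os.path.normpath(arg)
        if not any(path.startswith(prefix) for prefix in prefixes):
            return False
    return True
```

With this policy, `is_allowed(["cat", "/etc/shadow"])` is false while reading a file under the assumed instances directory is permitted. The sudo configuration then only has to permit the wrapper itself, which is the property described above.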

So instead, here's my second actionable thing... Is there a trivial change you can make which will dramatically improve security? I don't think anyone would claim that rootwrap is rocket science, but it improved things a lot -- deployers didn't need to grep out the command lines we executed any more, and we could do things like specify what paths we were allowed to do things in. Are there similarly trivial changes that you can make to improve your world?

But wait! Here's my third actionable thing as well -- what are the costs of your design? Some of these are obvious -- for example, with this design, executing something with escalated permissions causes us to pay to fork a process. In fact it's worse with rootwrap, because we pay to fork, start a Python interpreter to parse a configuration file, and then fork again for the actual binary we wanted in the first place. That cost adds up if you need to execute many small commands, for example when plugging in a new virtual network interface. At one point we measured this for network interfaces and the costs were in the tens of seconds per interface. There is another cost though which I think is actually more important. The only way we have with this mechanism to do something with escalated permissions is to execute it as a separate process. This is a horrible interface and forces us to do some really weird things. Let's check out some examples...
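One rough way to get a feel for that overhead on your own machine is to time interpreter startups against an equivalent number of in-process calls. This is only a sketch: the function names are invented here, it measures bare interpreter startup rather than rootwrap itself, and the numbers will vary by system.

```python
import subprocess
import sys
import time


def time_interpreter_spawns(count):
    """Return seconds spent starting `count` throwaway Python interpreters."""
    start = time.perf_counter()
    for _ in range(count):
        # Each iteration forks and starts a fresh interpreter that does nothing.
        subprocess.run([sys.executable, "-c", "pass"], check=True)
    return time.perf_counter() - start


def time_in_process_calls(count):
    """Return seconds spent making `count` trivial calls in this process."""
    start = time.perf_counter()
    for _ in range(count):
        len("0")
    return time.perf_counter() - start
```

Comparing `time_interpreter_spawns(100)` with `time_in_process_calls(100)` gives a lower bound on what each wrapped command costs before the real binary even starts.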

Which of the following commands are reasonable? These are just some examples, there are many others. The first is probably the most reasonable. It doesn't seem wise to me for us to implement our own data shredding code, so using a system command for that seems reasonable. The other examples are perhaps less reasonable -- the rm one is particularly scary to me. But none of these are the best example...

How about this one? Some commentary first. This code existed in the middle of a method that does other things. It's one of five command lines that method executes. What does it do? It's actually not too bad. Using root permissions, it writes a zero to the multicast_snooping sysctl for the network bridge being set up. It then checks the exit code and raises an exception if it's not 0 or 1. That said, it's also horrid. In order to write a single byte to a sysctl as root, we are forced to fork, start a Python process, read a configuration file, and then fork again, for an operation that in some situations might need to happen hundreds of times for OpenStack to restart on a node.
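The code from the slide is not in this transcript, so the following is a reconstruction of the shape of the thing rather than the original. It assumes the write is done by piping "0" into `tee`, that the bridge setting lives under `/sys/class/net/<bridge>/bridge/`, and that the wrapper is invoked via the `ROOTWRAP_PREFIX` shown; the function name and the injectable `runner` are illustrative only.

```python
import subprocess

# Assumed wrapper invocation; adjust for a real deployment.
ROOTWRAP_PREFIX = ["sudo", "nova-rootwrap", "/etc/nova/rootwrap.conf"]


def disable_snooping_via_command(bridge, runner=subprocess.run):
    """Write "0" to the bridge's multicast_snooping setting by shelling out."""
    path = "/sys/class/net/%s/bridge/multicast_snooping" % bridge
    cmd = ROOTWRAP_PREFIX + ["tee", path]
    # tee copies stdin to the file; its echo on stdout is discarded.
    result = runner(cmd, input=b"0", stdout=subprocess.DEVNULL)
    # Exit codes 0 and 1 are tolerated, anything else is an error.
    if result.returncode not in (0, 1):
        raise RuntimeError("%r exited with %d" % (cmd, result.returncode))
    return cmd
```

Even in this trimmed form, a one-byte write has turned into a command line, an input pipe, and exit code handling.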

This is how we get to the third way that OpenStack does escalated permissions. If we could just write Python code that ran as root, we could write this instead. It's not perfect, but it's a lot cheaper to execute and we could put it in a method with a helpful name like "disable multicast snooping" for extra credit. Which brings us to...
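As a hedged sketch of what that in-process version could look like (the slide itself is missing from this transcript), assuming the setting lives under `/sys/class/net/<bridge>/bridge/`. The `sysfs_root` parameter is an addition for illustration so the function can be pointed at a scratch directory.

```python
import os


def disable_multicast_snooping(bridge, sysfs_root="/sys/class/net"):
    """Turn off multicast snooping for a bridge with a plain file write.

    Must run in a process that already has permission to write the file.
    """
    path = os.path.join(sysfs_root, bridge, "bridge", "multicast_snooping")
    with open(path, "w") as f:
        f.write("0")
```

No fork, no second interpreter, no configuration file to parse; the open question is how to get this code running with the right permissions.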

Hire Angus Lees and make him angry. Angus noticed this problem well before the rest of us. We were all lounging around basking in our own general cleverness. What Angus proposed is that instead of all this forking and parsing and general mucking around, we just start a separate process at startup with special permissions, and then send it commands to execute. He could have done that with a relatively horrible API, for example just sending command lines down the pipe and getting their responses back to parse, but instead he implemented a system of Python decorators which let us call a method which is marked up as saying "I want to run as root!".

So here's the destination in our journey, how we actually do that thing in OpenStack now. The decorator before the method definition is a bit opaque, but basically says "run this thing as root", and the rest is a method which can be called from anywhere within our code. There are a few things you need to do to set up privsep, but I don't have time in this talk to discuss the specifics. Effectively you need to arrange for the privsep helper to start with escalated permissions, and you need to move the code which will run with one of these decorators to a sub path of your source tree to stop other code from accidentally being escalated. privsep is also capable of running with more than one set of permissions -- it will start a helper for each set. That's what this decorator is doing, specifying what permissions we need for this method.
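To make the decorator idea concrete without depending on the real library, here is a toy stand-in. It is not oslo.privsep: `ToyPrivContext` and everything inside it are invented for illustration, the "helper" runs in the same process, and the capability list is recorded but not enforced. What it does show is the pattern: decorated functions are registered with a context, callers get a stub, and only registered names can be invoked across the boundary.

```python
import functools
import json


class ToyPrivContext:
    """Toy stand-in for a privilege-separation context. Not oslo.privsep."""

    def __init__(self, name, capabilities):
        self.name = name
        self.capabilities = tuple(capabilities)  # recorded only, not enforced
        self._entrypoints = {}

    def entrypoint(self, func):
        """Register func and return a stub that calls it via the 'channel'."""
        self._entrypoints[func.__name__] = func

        @functools.wraps(func)
        def stub(*args):
            # Serialise the call, as it would be to cross a process boundary.
            request = json.dumps({"name": func.__name__, "args": list(args)})
            reply = self._helper_handle(request)
            return json.loads(reply)["result"]

        return stub

    def _helper_handle(self, request):
        """What a privileged helper would do on receiving a request."""
        message = json.loads(request)
        func = self._entrypoints.get(message["name"])
        if func is None:
            raise PermissionError("not an entrypoint: %s" % message["name"])
        return json.dumps({"result": func(*message["args"])})


# One context per set of permissions; the capability name is an example.
sys_admin = ToyPrivContext("sys_admin", capabilities=["CAP_NET_ADMIN"])


@sys_admin.entrypoint
def multicast_snooping_path(bridge):
    # Stand-in body; a real entrypoint would perform the privileged write.
    return "/sys/class/net/%s/bridge/multicast_snooping" % bridge
```

Callers simply write `multicast_snooping_path("br100")`. In a real privilege-separated design the request would travel to a helper process started with the escalated permissions, and a second context with a different capability set would get its own helper.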

And here we land at my final actionable thing. Make it easy to do the right thing, and hard to do the wrong thing. Rusty Russell used to talk about this at linux.conf.au when he was going through a phase of trying to clean up kernel APIs -- it's important that your interfaces make it obvious how to use them correctly, and make it hard to use them incorrectly. In the example used for this talk, having command lines executed as root meant that the prevalent example of how to do many things was a command line. So people started doing that even when they didn't need escalated permissions -- for example calling mkdir instead of using our helper function to recursively make a set of directories. We've cleaned that up, but we've also made it much, much harder to just drop a command line into our code base to run as root, which will hopefully stop some of this problem recurring in the future. I don't think OpenStack has reached perfection in this regard yet, but we continue to improve a little each day and that's probably all we can hope for.
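The mkdir case is a good illustration of how little code the right thing needs. A minimal sketch, assuming nothing beyond the standard library; the name `ensure_tree` is chosen here for illustration and is not necessarily what the helper mentioned above is called.

```python
import os


def ensure_tree(path, mode=0o755):
    """Create path and any missing parents; do nothing if it already exists."""
    # Equivalent in effect to running `mkdir -p path`, but without forking a
    # process and without needing any escalated permissions of its own.
    os.makedirs(path, mode=mode, exist_ok=True)
```

If the easiest example to copy is a one-line call like this rather than a command line, new code tends to follow it.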

privsep can be used for non-OpenStack projects too. There's really nothing specific about most of OpenStack's underlying libraries in fact, and there are probably things there which are useful to you. In fact the real problem is working out what is where because there's so much of it. One final thing -- privsep makes it possible to specify the exact permissions needed to do something. For example, setting up a network bridge probably doesn't need "read everything on the filesystem" permissions. We originally did that, but stepped back to using a single escalated permissions set that maps to what you get with sudo, because working out what permissions a single operation needed was actually quite hard. We were trying to lower the barrier to entry for doing things the right way. I don't think I really have time to dig into that much more here, but I'd be happy to chat about it sometime this weekend or on the Internet later.
