Understanding and Automatically Preventing Injection Attacks on Node.js
Michael Pradel, TU Darmstadt
Joint work with Cristian Staicu (TU Darmstadt) and Ben Livshits (Microsoft Research, Redmond)

Why JavaScript? Relevant and challenging
[Figure: rank of top languages on GitHub over time. Source: GitHub.com]
[Figure labels: 1096 pages, 153 pages]

Motivation: JavaScript (In)Security
JavaScript: Popular beyond the browser

    Application                   Runtime      Isolation
    Client-side web app           Browser      Sandbox
    Server-side or desktop app    Node.js      No sandbox!
    Mobile app                    Dalvik VM    Sandbox

Each runtime sits directly on top of the operating system.

Culture of Naive Reuse
Node.js code: Builds on 3rd-party code
- Over 300,000 modules
- No specified trust relationships between modules
- Many indirect dependencies
Risk of vulnerable and malicious code

Real Example: Growl Module

    var msg = /* receive from network */;
    growl(msg);

Growl module:
- Runs a platform-specific command to show notifications
- Passes the message to the command without any checks

Running Example

    function backupfile(name, ext) {
      var cmd = [];
      cmd.push("cp");
      cmd.push(name + "." + ext);
      cmd.push(" /.localbackup/");
      // Construct shell command and execute it
      exec(cmd.join(" "));
      var kind = (ext === "jpg") ? "pics" : "other";
      // Construct JavaScript code and execute it
      console.log(eval("messages.backup_" + kind));
    }

Injection APIs (here exec and eval): Interpret string as code
Injection attack: backupfile("-h && rm -rf * && echo ", "")

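To see why this input is an attack, the following sketch reproduces only the string construction from backupfile and prints the resulting command instead of executing it. The helper name buildBackupCommand is hypothetical and not part of the talk; nothing here calls exec.

```javascript
// Hypothetical helper: same string construction as backupfile(),
// but the command is returned rather than passed to exec().
function buildBackupCommand(name, ext) {
  const cmd = [];
  cmd.push("cp");
  cmd.push(name + "." + ext);
  cmd.push(" /.localbackup/");
  return cmd.join(" ");
}

const benign = buildBackupCommand("f", "txt");
const malicious = buildBackupCommand("-h && rm -rf * && echo ", "");

// JSON.stringify makes the exact whitespace visible.
console.log(JSON.stringify(benign));
console.log(JSON.stringify(malicious));

// Count the commands a shell would see when splitting on "&&".
console.log(malicious.split("&&").length);
```

With the malicious input, the single cp invocation becomes three commands chained with &&, and the middle one is rm -rf *.
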
Our Contributions
1. Study of injection vulnerabilities
- First large-scale study of Node.js security
- 236K modules, 816M lines of JavaScript
2. Repair of vulnerabilities
- Static analysis and runtime enforcement
- Automatic and easy to deploy
- Small overhead and high accuracy

Study: Prevalence
Are injection vulnerabilities widespread?
[Figure: direct uses; indirect uses via other modules]
Manual inspection of 150 call sites:
- Attacker-controlled data may reach API: 58%
- Defense mechanisms: none 90%, regular expression 9%

Study: Developer Reactions
Do developers fix vulnerabilities?
- Reported 20 previously unknown vulnerabilities
- After several months, only 3 fixed
Need mitigation technique that requires very little developer attention

Our Contributions
1. Study of injection vulnerabilities
- First large-scale study of Node.js security
- 236K modules, 816M lines of JavaScript
2. Repair of vulnerabilities
- Static analysis and runtime enforcement
- Automatic and easy to deploy
- Small overhead and high accuracy

Preventing Injections
- Vulnerable code -> Static analysis -> String templates, or statically safe code
- String templates -> Synthesize policy -> Code with runtime checks
- Code with runtime checks + runtime inputs -> Dynamic enforcement -> Safe runtime behavior

Static Analysis: Template Trees
1. Backward data flow analysis
- Overapproximate strings passed to injection API
- Represent possible values as a tree

    function backupfile(name, ext) {
      var cmd = [];
      cmd.push("cp");
      cmd.push(name + "." + ext);
      cmd.push(" /.localbackup/");
      exec(cmd.join(" "));
    }

The tree grows backward from the exec call. $cmd stands for the still-unresolved array until the analysis reaches var cmd = []. Final tree:

    join
      push
        push
          push
            empty array
            cp
          +
            $name
            .
            $ext
        /.localbackup/

Static Analysis: Templates
2. Evaluate template trees into templates
- Statically model operations (bottom-up)
- Unknown parts to be filled at runtime
The template tree of backupfile evaluates to the template:

    cp $name.$ext /.localbackup/

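As a rough illustration of the bottom-up evaluation, the sketch below encodes the template tree of backupfile as plain objects and folds it into a template. The node constructors (lit, hole, concat, emptyArray, push, join) and the functions evaluate and render are assumptions made for this example, not the representation used by the authors' tool.

```javascript
// Hypothetical node constructors for a template tree.
const lit = (value) => ({ kind: "const", value });
const hole = (name) => ({ kind: "hole", name });
const concat = (...parts) => ({ kind: "concat", parts });
const emptyArray = () => ({ kind: "emptyArray" });
const push = (array, value) => ({ kind: "push", array, value });
const join = (array, sep) => ({ kind: "join", array, sep });

// Tree for: cmd = []; push("cp"); push(name + "." + ext);
//           push(" /.localbackup/"); cmd.join(" ")
const tree = join(
  push(
    push(
      push(emptyArray(), lit("cp")),
      concat(hole("name"), lit("."), hole("ext"))
    ),
    lit(" /.localbackup/")
  ),
  " "
);

// Bottom-up evaluation. String-valued nodes yield a list of parts
// (constant strings or holes); array-valued nodes yield a list of
// such part lists, one per array element.
function evaluate(node) {
  switch (node.kind) {
    case "const":
      return [node.value];
    case "hole":
      return [{ hole: node.name }];
    case "concat":
      return node.parts.flatMap(evaluate);
    case "emptyArray":
      return [];
    case "push":
      return [...evaluate(node.array), evaluate(node.value)];
    case "join":
      return evaluate(node.array).flatMap((element, i) =>
        i === 0 ? element : [node.sep, ...element]
      );
    default:
      throw new Error("unknown node kind: " + node.kind);
  }
}

// Render a template, writing each hole as $name.
const render = (parts) =>
  parts.map((p) => (typeof p === "string" ? p : "$" + p.hole)).join("");

const template = evaluate(tree);
console.log(JSON.stringify(render(template)));
```

Constant parts are folded into fixed text, while name and ext stay symbolic as holes to be filled at runtime.
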
Synthesizing a Policy
Create runtime policy from templates
- Enforce structure via partial AST
- For unknown parts, allow only benign AST nodes
Template: cp $name.$ext /.localbackup/
Partial AST (Bash grammar):

    Command
      cp
      Arguments
        ???
        /.localbackup/

Runtime Enforcement
Enforce policy on strings passed to injection APIs
Policy:

    Command
      cp
      Arguments
        ???
        /.localbackup/

Runtime string: cp f.txt /.localbackup/

    Command
      cp
      Arguments
        f.txt
        /.localbackup/

Result: Accepted

Runtime string: cp -h && rm -rf * && echo /.localbackup/

    CompoundCmd
      Command ...
      Command ...
      Command ...

Result: Rejected

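The accept/reject decision for these two runtime strings can be approximated with a much simpler check than a full Bash parser. The sketch below is a deliberately conservative stand-in, not the AST-based policy from the talk: the policy object, HOLE, parseSimple, and isAllowed are all hypothetical names. It rejects any string containing quoting, expansion, or redirection characters, then splits on shell control operators and whitespace.

```javascript
// Hypothetical policy for the template "cp ??? /.localbackup/":
// one simple command, cp, whose first argument is a hole.
const HOLE = Symbol("hole");
const policy = { command: "cp", args: [HOLE, "/.localbackup/"] };

// Tiny stand-in for a shell parser. Returns a list of commands, each a
// list of words, or null if the string uses syntax this sketch does not
// model (quotes, expansions, redirections, subshells, comments).
function parseSimple(str) {
  if (/["'`$\\<>(){}#\n]/.test(str)) return null;
  return str
    .split(/&&|\|\||[;|&]/)
    .map((part) => part.trim().split(/\s+/).filter(Boolean));
}

// Accept only strings whose structure matches the policy. A hole may be
// filled only by a single plain word (assumed character set below).
function isAllowed(str, policy) {
  const commands = parseSimple(str);
  if (commands === null || commands.length !== 1) return false;
  const [name, ...args] = commands[0];
  if (name !== policy.command) return false;
  if (args.length !== policy.args.length) return false;
  return policy.args.every((expected, i) =>
    expected === HOLE
      ? /^[A-Za-z0-9._\/-]+$/.test(args[i])
      : expected === args[i]
  );
}

console.log(isAllowed("cp f.txt /.localbackup/", policy));
console.log(isAllowed("cp -h && rm -rf * && echo /.localbackup/", policy));
```

The design choice mirrors the slide: the benign string has the same shape as the policy and fills the hole with one literal word, whereas the attack string changes the shape from one command to three.
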
Evaluation: Static Analysis
Setup: 51K call sites of injection APIs
Precision: [Chart: statically safe vs. to be checked at runtime; values 63.3% and 36.7%]
Most call sites:
- At least 10 known characters
- Only 1 hole
Performance: 4.4 seconds per module

Evaluation: Runtime Enforcement
Setup:
- 24 modules
- 56 benign and 65 malicious inputs
Results:
- Zero false negatives (i.e., no missed injections)
- Five false positives (i.e., overly conservative)
- Overhead (avg.): 0.74 milliseconds per call

Conclusion
Understand injection vulnerabilities
- First large-scale empirical study of Node.js (in)security
Detect and prevent injections
- Static inference of expected string values
- AST-based runtime policy
Automated repair of vulnerabilities
More details: Technical report on my web site