jsluice

v0.0.0-...-0ddfab1
Published: Jan 10, 2024 License: MIT Imports: 17 Imported by: 9

README

# jsluice

[![Go Reference](https://pkg.go.dev/badge/github.com/BishopFox/jsluice.svg)](https://pkg.go.dev/github.com/BishopFox/jsluice)

`jsluice` is a Go package and [command-line tool](/cmd/jsluice/) for extracting URLs, paths, secrets,
and other interesting data from JavaScript source code.

If you want to do those things right away: look at the [command-line tool](/cmd/jsluice/).

If you want to integrate `jsluice`'s capabilities with your own project: look at the [examples](/examples/),
and read the [package documentation](https://pkg.go.dev/github.com/BishopFox/jsluice).

## Install

To install the command-line tool, run:

```
▶ go install github.com/BishopFox/jsluice/cmd/jsluice@latest
```

To add the package to your project, run:

```
▶ go get github.com/BishopFox/jsluice
```

## Extracting URLs

Rather than using regular expressions alone, `jsluice` uses `go-tree-sitter` to look for places where URLs are known to be used,
such as being assigned to `document.location`, passed to `window.open()`, or passed to `fetch()`.

A simple example program is provided [here](/examples/basic/main.go):

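```go
package main

import (
    "encoding/json"
    "fmt"

    "github.com/BishopFox/jsluice"
)

func main() {
    // Parse the JavaScript source; the Analyzer wraps the resulting syntax tree.
    analyzer := jsluice.NewAnalyzer([]byte(`
        const login = (redirect) => {
            document.location = "/login?redirect=" + redirect + "&method=oauth"
        }
    `))

    // Print every URL that was found as indented JSON.
    for _, url := range analyzer.GetURLs() {
        j, err := json.MarshalIndent(url, "", "  ")
        if err != nil {
            continue
        }

        fmt.Printf("%s\n", j)
    }
}
```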

Running the example:

```
▶ go run examples/basic/main.go
{
  "url": "/login?redirect=EXPR\u0026method=oauth",
  "queryParams": [
    "method",
    "redirect"
  ],
  "bodyParams": [],
  "method": "GET",
  "type": "locationAssignment",
  "source": "document.location = \"/login?redirect=\" + redirect + \"\u0026method=oauth\""
}
```

Note that the value of the `redirect` query string parameter is `EXPR`.
Code like this is common in JavaScript:

```javascript
document.location = "/login?redirect=" + redirect + "&method=oauth"
```

`jsluice` understands string concatenation, and replaces any expressions it cannot know the value
of with `EXPR`. Although not a foolproof solution, this approach results in a valid URL or path
more often than not, and means that it's possible to discover things that aren't easily found using
other approaches. In this case, a naive regular expression may well miss the `method` query string
parameter:

```
▶ JS='document.location = "/login?redirect=" + redirect + "&method=oauth"'
▶ echo $JS | grep -oE 'document\.location = "[^"]+"'
document.location = "/login?redirect="
```
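
To make the idea of collapsing a concatenation concrete, here is a minimal standard-library sketch. It is not how `jsluice` does it (the library works on a tree-sitter syntax tree); `collapseConcat` is a hypothetical helper that splits on `+` outside string literals and swaps every non-literal operand for the placeholder. It ignores escaped quotes and nested expressions, so treat it as an illustration only.

```go
package main

import (
	"fmt"
	"strings"
)

// placeholder mirrors the documented default value "EXPR".
const placeholder = "EXPR"

// collapseConcat is a hypothetical helper: it splits a JavaScript
// concatenation on '+' characters that sit outside string literals, keeps the
// contents of quoted operands, and substitutes the placeholder for the rest.
func collapseConcat(expr string) string {
	var out, operand strings.Builder
	var quote rune // 0 while outside a string literal

	// flush emits the operand collected so far and resets the buffer.
	flush := func() {
		op := strings.TrimSpace(operand.String())
		operand.Reset()
		if op == "" {
			return
		}
		first, last := op[0], op[len(op)-1]
		if len(op) >= 2 && (first == '"' || first == '\'') && last == first {
			out.WriteString(op[1 : len(op)-1])
			return
		}
		out.WriteString(placeholder)
	}

	for _, r := range expr {
		switch {
		case quote != 0:
			if r == quote {
				quote = 0 // closing quote
			}
		case r == '"' || r == '\'':
			quote = r // opening quote
		case r == '+':
			flush()
			continue
		}
		operand.WriteRune(r)
	}
	flush()
	return out.String()
}

func main() {
	js := `"/login?redirect=" + redirect + "&method=oauth"`
	fmt.Println(collapseConcat(js)) // prints: /login?redirect=EXPR&method=oauth
}
```

Unlike the `grep` pattern above, the collapsed result keeps the text after the variable, so the `method` parameter survives.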

### Custom URL Matchers

`jsluice` comes with some built-in URL matchers for common scenarios, but you can add more
with the `AddURLMatcher` function:

```go
package main

import (
    "fmt"
    "strings"

    "github.com/BishopFox/jsluice"
)

func main() {
    analyzer := jsluice.NewAnalyzer([]byte(`
        var fn = () => {
            var meta = {
                contact: "mailto:contact@example.com",
                home: "https://example.com"
            }
            return meta
        }
    `))

    analyzer.AddURLMatcher(
        jsluice.URLMatcher{
            // Type is the type of node to look for. It can be one of
            // "string", "assignment_expression", or "call_expression".
            Type: "string",
            // Fn inspects each matching node and returns nil for non-matches.
            Fn: func(n *jsluice.Node) *jsluice.URL {
                val := n.DecodedString()
                if !strings.HasPrefix(val, "mailto:") {
                    return nil
                }

                return &jsluice.URL{
                    URL:  val,
                    Type: "mailto",
                }
            },
        },
    )

    for _, match := range analyzer.GetURLs() {
        fmt.Println(match.URL)
    }
}
```

There's a copy of this example [here](/examples/urlmatcher/main.go). You can run it like this:

```
▶ go run examples/urlmatcher/main.go
mailto:contact@example.com
https://example.com
```

`jsluice` doesn't match `mailto:` URIs by default; the `mailto:` result was found by the custom `URLMatcher`.


## Extracting Secrets

As well as URLs, `jsluice` can extract secrets. As with URL extraction, custom matchers can
be supplied to supplement the default matchers. There's a short example program [here](/examples/secrets/main.go)
that does just that:

```go
package main

import (
    "encoding/json"
    "fmt"
    "strings"

    "github.com/BishopFox/jsluice"
)

func main() {
    analyzer := jsluice.NewAnalyzer([]byte(`
        var config = {
            apiKey: "AUTH_1a2b3c4d5e6f",
            apiURL: "https://api.example.com/v2/"
        }
    `))

    analyzer.AddSecretMatcher(
        jsluice.SecretMatcher{
            // Query is a tree-sitter query to run on the JavaScript source.
            Query: "(pair) @match",
            // Fn inspects each captured node and returns nil for non-matches.
            Fn: func(n *jsluice.Node) *jsluice.Secret {
                key := n.ChildByFieldName("key").DecodedString()
                value := n.ChildByFieldName("value").DecodedString()

                if !strings.Contains(key, "api") {
                    return nil
                }

                if !strings.HasPrefix(value, "AUTH_") {
                    return nil
                }

                return &jsluice.Secret{
                    Kind: "fakeApi",
                    Data: map[string]string{
                        "key":   key,
                        "value": value,
                    },
                    Severity: jsluice.SeverityLow,
                    // The parent of the pair is the enclosing object.
                    Context: n.Parent().AsMap(),
                }
            },
        },
    )

    for _, match := range analyzer.GetSecrets() {
        j, err := json.MarshalIndent(match, "", "  ")
        if err != nil {
            continue
        }

        fmt.Printf("%s\n", j)
    }
}
```

Running the example:

```
▶ go run examples/secrets/main.go
[2023-06-14T13:04:16+0100]
{
  "kind": "fakeApi",
  "data": {
    "key": "apiKey",
    "value": "AUTH_1a2b3c4d5e6f"
  },
  "severity": "low",
  "context": {
    "apiKey": "AUTH_1a2b3c4d5e6f",
    "apiURL": "https://api.example.com/v2/"
  }
}
```

Because we have a syntax tree available for the entire JavaScript source,
it was possible to inspect both the `key` and `value`, and also to easily
provide the parent object as context for the match.

Documentation

Constants

This section is empty.

Variables

var ExpressionPlaceholder = "EXPR"

ExpressionPlaceholder is the string used to replace any expressions when string concatenations are collapsed. E.g:

"prefix" + someVar + "suffix"

Would become:

prefixEXPRsuffix

Functions

func DecodeString

func DecodeString(in string) string

DecodeString accepts a raw string as it might be found in some JavaScript source code, and converts any escape sequences. E.g:

foo\x3dbar -> foo=bar // Hex escapes
foo\u003Dbar -> foo=bar // Unicode escapes
foo\u{003D}bar -> foo=bar // Braced unicode escapes
foo\075bar -> foo=bar // Octal escape
foo\"bar -> foo"bar // Single character escapes
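
As a rough illustration of what decoding these escape forms involves, the following standard-library sketch handles the five cases listed above. `decodeEscapes` is a hypothetical name and this is not jsluice's implementation: surrogate pairs and line continuations are ignored, and a malformed escape simply loses its backslash.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// decodeEscapes is a hypothetical, simplified decoder for hex, unicode,
// braced unicode, octal, and single-character escapes.
func decodeEscapes(in string) string {
	var out strings.Builder
	for i := 0; i < len(in); i++ {
		if in[i] != '\\' || i+1 >= len(in) {
			out.WriteByte(in[i])
			continue
		}
		rest := in[i+1:] // everything after the backslash
		switch {
		case rest[0] == 'x' && len(rest) >= 3: // \xHH
			if v, err := strconv.ParseUint(rest[1:3], 16, 8); err == nil {
				out.WriteRune(rune(v))
				i += 3
				continue
			}
		case strings.HasPrefix(rest, "u{"): // \u{H...}
			if end := strings.IndexByte(rest, '}'); end > 2 {
				if v, err := strconv.ParseUint(rest[2:end], 16, 32); err == nil {
					out.WriteRune(rune(v))
					i += end + 1
					continue
				}
			}
		case rest[0] == 'u' && len(rest) >= 5: // \uHHHH
			if v, err := strconv.ParseUint(rest[1:5], 16, 16); err == nil {
				out.WriteRune(rune(v))
				i += 5
				continue
			}
		case rest[0] >= '0' && rest[0] <= '7': // up to three octal digits
			n := 1
			for n < 3 && n < len(rest) && rest[n] >= '0' && rest[n] <= '7' {
				n++
			}
			v, _ := strconv.ParseUint(rest[:n], 8, 16)
			out.WriteRune(rune(v))
			i += n
			continue
		}
		// Single-character escapes; anything unrecognised maps to itself.
		switch rest[0] {
		case 'n':
			out.WriteByte('\n')
		case 'r':
			out.WriteByte('\r')
		case 't':
			out.WriteByte('\t')
		default:
			out.WriteByte(rest[0])
		}
		i++
	}
	return out.String()
}

func main() {
	for _, s := range []string{`foo\x3dbar`, `foo\u003Dbar`, `foo\u{003D}bar`, `foo\075bar`, `foo\"bar`} {
		fmt.Printf("%s -> %s\n", s, decodeEscapes(s))
	}
}
```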

func MaybeURL

func MaybeURL(in string) bool

func PrintTree

func PrintTree(source []byte) string

PrintTree returns a string representation of the syntax tree for the provided JavaScript source

Types

type Analyzer

type Analyzer struct {
	// contains filtered or unexported fields
}

Analyzer could be considered the core type of jsluice. It wraps the parse tree for a JavaScript file and provides mechanisms to extract URLs, secrets, etc.

func NewAnalyzer

func NewAnalyzer(source []byte) *Analyzer

NewAnalyzer accepts a slice of bytes representing some JavaScript source code and returns a pointer to a new Analyzer

func (*Analyzer) AddSecretMatcher

func (a *Analyzer) AddSecretMatcher(s SecretMatcher)

AddSecretMatcher allows custom SecretMatchers to be added to the Analyzer

func (*Analyzer) AddSecretMatchers

func (a *Analyzer) AddSecretMatchers(ss []SecretMatcher)

AddSecretMatchers allows multiple custom SecretMatchers to be added to the Analyzer

func (*Analyzer) AddURLMatcher

func (a *Analyzer) AddURLMatcher(u URLMatcher)

AddURLMatcher allows custom URLMatchers to be added to the Analyzer

func (*Analyzer) DisableDefaultURLMatchers

func (a *Analyzer) DisableDefaultURLMatchers()

DisableDefaultURLMatchers disables the default URLMatchers, so that only user-added URLMatchers are used.

func (*Analyzer) GetSecrets

func (a *Analyzer) GetSecrets() []*Secret

GetSecrets uses the parse tree and a set of Matchers (those provided by AllSecretMatchers()) to find secrets in JavaScript source code.

func (*Analyzer) GetURLs

func (a *Analyzer) GetURLs() []*URL

GetURLs searches the JavaScript source code for absolute and relative URLs and returns a slice of results.

func (*Analyzer) Query

func (a *Analyzer) Query(q string, fn func(*Node))

Query performs a tree-sitter query on the JavaScript being analyzed. The provided function is called once for every node that is captured by the query. See https://tree-sitter.github.io/tree-sitter/using-parsers#query-syntax for details on query syntax.

func (*Analyzer) QueryMulti

func (a *Analyzer) QueryMulti(q string, fn func(QueryResult))

QueryMulti performs a tree-sitter query on the JavaScript being analyzed. The provided function is called for every query match, with captured nodes grouped into a QueryResult. See https://tree-sitter.github.io/tree-sitter/using-parsers#query-syntax for details on query syntax.

func (*Analyzer) RootNode

func (a *Analyzer) RootNode() *Node

RootNode returns the root node of the parsed JavaScript

type Node

type Node struct {
	// contains filtered or unexported fields
}

Node is a wrapper around a tree-sitter node. It serves as an attachment point for convenience methods, and also to store the raw JavaScript source that is a required argument for many tree-sitter functions.

func NewNode

func NewNode(n *sitter.Node, source []byte) *Node

NewNode creates a new Node for the provided tree-sitter node and a byte-slice containing the JavaScript source. The source provided should be the complete source code and not just the source for the node in question.

func (*Node) AsArray

func (n *Node) AsArray() []any

AsArray returns a representation of the Node as a []any

func (*Node) AsGoType

func (n *Node) AsGoType() any

AsGoType returns a representation of a Node as a native Go type, defaulting to a string containing the JavaScript source for the Node. Return types are:

string => string
number => int, float64
object => map[string]any
array  => []any
false  => false
true   => true
null   => nil
other  => string
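
Because the result is an `any`, callers will typically need a type switch. The sketch below shows one way to cover every documented branch; the sample values are built by hand rather than produced by jsluice, and `describe` is a hypothetical helper.

```go
package main

import "fmt"

// describe reports which branch of the documented mapping a value falls into.
func describe(v any) string {
	switch t := v.(type) {
	case nil:
		return "null"
	case bool:
		return fmt.Sprintf("bool %t", t)
	case int:
		return fmt.Sprintf("int %d", t)
	case float64:
		return fmt.Sprintf("float64 %g", t)
	case string:
		return fmt.Sprintf("string %q", t)
	case []any:
		return fmt.Sprintf("array of %d", len(t))
	case map[string]any:
		return fmt.Sprintf("object with %d keys", len(t))
	default:
		return fmt.Sprintf("unexpected %T", t)
	}
}

func main() {
	// Hand-built stand-ins for values a caller might receive.
	samples := []any{
		"https://example.com",
		42,
		1.5,
		map[string]any{"apiKey": "AUTH_1a2b3c4d5e6f"},
		[]any{1, "two"},
		false,
		nil,
	}
	for _, s := range samples {
		fmt.Println(describe(s))
	}
}
```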

func (*Node) AsMap

func (n *Node) AsMap() map[string]any

AsMap returns a representation of the Node as a map[string]any

func (*Node) AsNumber

func (n *Node) AsNumber() any

AsNumber returns a representation of the Node as an int or float64.

Note: hex, octal etc number formats are currently unsupported
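
One plausible way to get an "int or float64" result from a decimal literal is to try an integer parse first and fall back to a float parse. The sketch below does that with `strconv`; `parseNumber` is a hypothetical helper, and returning nil for anything neither parser accepts is an assumption of this sketch, not documented jsluice behavior.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseNumber is a hypothetical helper: it returns an int when the literal is
// a plain decimal integer, a float64 when it parses as a float, and nil
// otherwise.
func parseNumber(lit string) any {
	if i, err := strconv.Atoi(lit); err == nil {
		return i
	}
	if f, err := strconv.ParseFloat(lit, 64); err == nil {
		return f
	}
	return nil
}

func main() {
	for _, lit := range []string{"42", "3.14", "1e3"} {
		v := parseNumber(lit)
		fmt.Printf("%s => %v (%T)\n", lit, v, v)
	}
}
```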

func (*Node) AsObject

func (n *Node) AsObject() Object

AsObject returns a Node as jsluice's internal object type, to allow the fetching of keys etc

func (*Node) CaptureName

func (n *Node) CaptureName() string

CaptureName returns the name given to a node in a query if one exists, and an empty string otherwise

func (*Node) Child

func (n *Node) Child(index int) *Node

Child returns the child Node at the provided index

func (*Node) ChildByFieldName

func (n *Node) ChildByFieldName(name string) *Node

ChildByFieldName fetches a child Node from a named field. For example, the 'pair' node has two fields: key and value.

func (*Node) ChildCount

func (n *Node) ChildCount() int

ChildCount returns the number of children a node has

func (*Node) Children

func (n *Node) Children() []*Node

Children returns a slice of *Node containing all children for a node

func (*Node) CollapsedString

func (n *Node) CollapsedString() string

CollapsedString takes a node representing a URL and attempts to make it at least somewhat easily parseable. It's common to build URLs out of variables and function calls so we want to turn something like:

'./upload.php?profile='+res.id+'&show='+$('.participate_modal_container').attr('data-val')

Into something more like:

./upload.php?profile=EXPR&show=EXPR

The value of ExpressionPlaceholder is used as a placeholder, defaulting to 'EXPR'

func (*Node) Content

func (n *Node) Content() string

Content returns the source code for a particular node.

func (*Node) DecodedString

func (n *Node) DecodedString() string

DecodedString returns a fully decoded version of a JavaScript string. It is just a convenience wrapper around the DecodeString function.

func (*Node) ForEachChild

func (n *Node) ForEachChild(fn func(*Node))

ForEachChild iterates over a node's children in a depth-first manner, calling the supplied function for each node

func (*Node) ForEachNamedChild

func (n *Node) ForEachNamedChild(fn func(*Node))

ForEachNamedChild iterates over a node's named children in a depth-first manner, calling the supplied function for each node

func (*Node) Format

func (n *Node) Format() (string, error)

Format outputs a nicely formatted version of the source code for the Node. Formatting is done by https://github.com/ditashi/jsbeautifier-go/

func (*Node) IsNamed

func (n *Node) IsNamed() bool

IsNamed returns true if the underlying node is named

func (*Node) IsStringy

func (n *Node) IsStringy() bool

IsStringy returns true if a Node is a string or is an expression starting with a string (e.g. a string concatenation expression).

func (*Node) IsValid

func (n *Node) IsValid() bool

IsValid returns true if the *Node and the underlying tree-sitter node are both not nil.

func (*Node) NamedChild

func (n *Node) NamedChild(index int) *Node

NamedChild returns the 'named' child Node at the provided index. Tree-sitter considers a child to be named if it has a name in the syntax tree. Things like brackets are not named, but things like variables and function calls are named. See https://tree-sitter.github.io/tree-sitter/using-parsers#named-vs-anonymous-nodes for more details.

func (*Node) NamedChildCount

func (n *Node) NamedChildCount() int

NamedChildCount returns the number of named children a Node has.

func (*Node) NamedChildren

func (n *Node) NamedChildren() []*Node

NamedChildren returns a slice of *Node containing all named children for a node.

func (*Node) NextNamedSibling

func (n *Node) NextNamedSibling() *Node

NextNamedSibling returns the next named sibling in the tree

func (*Node) NextSibling

func (n *Node) NextSibling() *Node

NextSibling returns the next sibling in the tree

func (*Node) Parent

func (n *Node) Parent() *Node

Parent returns the Parent Node for a Node

func (*Node) PrevNamedSibling

func (n *Node) PrevNamedSibling() *Node

PrevNamedSibling returns the previous named sibling in the tree

func (*Node) PrevSibling

func (n *Node) PrevSibling() *Node

PrevSibling returns the previous sibling in the tree

func (*Node) Query

func (n *Node) Query(query string, fn func(*Node))

Query executes a tree-sitter query on a specific Node. Nodes captured by the query are passed one at a time to the provided callback function.

See https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries for query syntax documentation.

func (*Node) QueryMulti

func (n *Node) QueryMulti(query string, fn func(QueryResult))

QueryMulti executes a tree-sitter query on a specific Node. Nodes captured by the query are grouped into a QueryResult and passed to the provided callback function.

See https://tree-sitter.github.io/tree-sitter/using-parsers#pattern-matching-with-queries for query syntax documentation.

func (*Node) RawString

func (n *Node) RawString() string

RawString returns the raw JavaScript representation of a string (i.e. escape sequences are left undecoded) but with the surrounding quotes removed.

func (*Node) Type

func (n *Node) Type() string

Type returns the tree-sitter type string for a Node. E.g. string, object, call_expression. If the node is nil then an empty string is returned.

type Object

type Object struct {
	// contains filtered or unexported fields
}

Object is a wrapper around a Node that contains a JS object. It has convenience methods to find properties of the object, convert it to other types, etc.

func NewObject

func NewObject(n *Node, source []byte) Object

NewObject returns a jsluice Object for the given Node

func (Object) AsMap

func (o Object) AsMap() map[string]string

AsMap returns a Go map version of the object

func (Object) GetKeys

func (o Object) GetKeys() []string

GetKeys returns a slice of all keys in an object

func (Object) GetNode

func (o Object) GetNode(key string) *Node

GetNode returns the matching *Node for a given key

func (Object) GetNodeFunc

func (o Object) GetNodeFunc(fn func(key string) bool) *Node

GetNodeFunc is a general-purpose method for finding object properties by their key. The provided function is called with each key in turn. The first time that function returns true the corresponding *Node for that key is returned.

func (Object) GetNodeI

func (o Object) GetNodeI(key string) *Node

GetNodeI is like GetNode, but case-insensitive

func (Object) GetObject

func (o Object) GetObject(key string) Object

GetObject returns the property corresponding to the provided key as an Object

func (Object) GetString

func (o Object) GetString(key, defaultVal string) string

GetString returns the property corresponding to the provided key as a string, or the defaultVal if the key is not found.

func (Object) GetStringI

func (o Object) GetStringI(key, defaultVal string) string

GetStringI is like GetString, but the key is case-insensitive

func (Object) HasValidNode

func (o Object) HasValidNode() bool

HasValidNode returns true if the underlying node is a valid JavaScript object

type QueryResult

type QueryResult map[string]*Node

QueryResult is a map of capture names to the corresponding nodes that they matched

func NewQueryResult

func NewQueryResult(nodes ...*Node) QueryResult

NewQueryResult returns a QueryResult containing the provided *Nodes

func (QueryResult) Add

func (qr QueryResult) Add(n *Node)

Add accepts a *Node and adds it to the QueryResult, provided it has a valid CaptureName

func (QueryResult) Get

func (qr QueryResult) Get(captureName string) *Node

Get returns the corresponding *Node for the provided capture name, or nil if no such *Node exists

func (QueryResult) Has

func (qr QueryResult) Has(captureName string) bool

Has returns true if the QueryResult contains a *Node for the provided capture name
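
Since QueryResult is a map keyed by capture name, the semantics of Add, Get, and Has can be sketched with a plain Go map. The `node` and `queryResult` types below are hypothetical stand-ins, and treating an empty capture name as "not valid" is an assumption of this sketch.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a captured syntax-tree node.
type node struct {
	captureName string
	content     string
}

// queryResult maps capture names to the nodes they matched.
type queryResult map[string]*node

// add stores n under its capture name; nodes without one are skipped.
func (qr queryResult) add(n *node) {
	if n == nil || n.captureName == "" {
		return
	}
	qr[n.captureName] = n
}

// get returns the node for a capture name, or nil if there is none.
func (qr queryResult) get(name string) *node {
	return qr[name]
}

// has reports whether a node exists for the capture name.
func (qr queryResult) has(name string) bool {
	_, ok := qr[name]
	return ok
}

func main() {
	qr := queryResult{}
	qr.add(&node{captureName: "key", content: "apiKey"})
	qr.add(&node{content: "no capture name, so skipped"})

	fmt.Println(qr.has("key"), qr.get("key").content) // true apiKey
	fmt.Println(qr.has("value"), qr.get("value"))     // false <nil>
}
```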

type Secret

type Secret struct {
	Kind     string   `json:"kind"`
	Data     any      `json:"data"`
	Filename string   `json:"filename,omitempty"`
	Severity Severity `json:"severity"`
	Context  any      `json:"context"`
}

A Secret represents any secret or otherwise interesting data found within a JavaScript file. E.g. an AWS access key.

type SecretMatcher

type SecretMatcher struct {
	Query string
	Fn    func(*Node) *Secret
}

A SecretMatcher is a tree-sitter query to find relevant nodes in the parse tree, and a function to inspect those nodes, returning any Secret that is found.

func AllSecretMatchers

func AllSecretMatchers() []SecretMatcher

AllSecretMatchers returns the default list of SecretMatchers

type Severity

type Severity string

Severity indicates how serious a finding is

const (
	SeverityInfo   Severity = "info"
	SeverityLow    Severity = "low"
	SeverityMedium Severity = "medium"
	SeverityHigh   Severity = "high"
)

type URL

type URL struct {
	URL         string            `json:"url"`
	QueryParams []string          `json:"queryParams"`
	BodyParams  []string          `json:"bodyParams"`
	Method      string            `json:"method"`
	Headers     map[string]string `json:"headers,omitempty"`
	ContentType string            `json:"contentType,omitempty"`

	// some description like locationAssignment, fetch, $.post or something like that
	Type string `json:"type"`

	// full source/content of the node; is optional
	Source string `json:"source,omitempty"`

	// the filename in which the match was found
	Filename string `json:"filename,omitempty"`
}

A URL is any URL found in the source code with accompanying details
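
The struct tags determine how a result looks when marshalled to JSON: fields tagged `omitempty` disappear when empty, and Go's `encoding/json` escapes `&` as `\u0026` by default, which is why the README output shows `EXPR\u0026method=oauth`. The sketch below uses a local stand-in type (`foundURL`, copying a subset of the documented fields) to show both effects, plus `SetEscapeHTML(false)` for callers who prefer a literal `&`.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// foundURL is a local stand-in that copies a subset of the documented fields
// and their JSON tags.
type foundURL struct {
	URL         string            `json:"url"`
	QueryParams []string          `json:"queryParams"`
	Method      string            `json:"method"`
	Headers     map[string]string `json:"headers,omitempty"`
	Type        string            `json:"type"`
	Filename    string            `json:"filename,omitempty"`
}

// encode marshals v on one line. With escapeHTML set to false, characters
// such as '&' are written literally instead of as \u0026.
func encode(v any, escapeHTML bool) (string, error) {
	var buf bytes.Buffer
	enc := json.NewEncoder(&buf)
	enc.SetEscapeHTML(escapeHTML)
	if err := enc.Encode(v); err != nil {
		return "", err
	}
	return string(bytes.TrimSpace(buf.Bytes())), nil
}

func main() {
	u := foundURL{
		URL:         "/login?redirect=EXPR&method=oauth",
		QueryParams: []string{"method", "redirect"},
		Method:      "GET",
		Type:        "locationAssignment",
	}

	for _, escape := range []bool{true, false} {
		out, err := encode(u, escape)
		if err != nil {
			panic(err)
		}
		fmt.Println(out)
	}
}
```

Both forms decode to the same string, so the choice only affects how the output reads.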

type URLMatcher

type URLMatcher struct {
	Type string
	Fn   func(*Node) *URL
}

A URLMatcher has a type of thing it matches against (e.g. assignment_expression), and a function to actually do the matching and producing of the *URL

func AllURLMatchers

func AllURLMatchers() []URLMatcher

AllURLMatchers returns the default list of URLMatchers

type UserPattern

type UserPattern struct {
	Name     string   `json:"name"`
	Key      string   `json:"key"`
	Value    string   `json:"value"`
	Severity Severity `json:"severity"`

	Object []*UserPattern `json:"object"`
	// contains filtered or unexported fields
}

A UserPattern represents a pattern that was provided by a user when using the command-line tool. When using the package directly, a SecretMatcher can be created directly instead of creating a UserPattern.

func (*UserPattern) MatchKey

func (u *UserPattern) MatchKey(in string) bool

MatchKey returns true if a pattern's key regex matches the supplied value, or if there is no key regex

func (*UserPattern) MatchValue

func (u *UserPattern) MatchValue(in string) bool

MatchValue returns true if a pattern's value regex matches the supplied value, or if there is no value regex.
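
The "matches, or there is no regex" rule described for MatchKey and MatchValue can be sketched with nil-able `*regexp.Regexp` fields. The `pattern` type and `newPattern` constructor below are hypothetical; treating an empty expression as "no regex" is an assumption of this sketch.

```go
package main

import (
	"fmt"
	"regexp"
)

// pattern is a hypothetical stand-in holding optional compiled regexes.
type pattern struct {
	key   *regexp.Regexp // nil means "match any key"
	value *regexp.Regexp // nil means "match any value"
}

// newPattern compiles the non-empty expressions and leaves the rest nil.
func newPattern(keyExpr, valueExpr string) (*pattern, error) {
	p := &pattern{}
	var err error
	if keyExpr != "" {
		if p.key, err = regexp.Compile(keyExpr); err != nil {
			return nil, err
		}
	}
	if valueExpr != "" {
		if p.value, err = regexp.Compile(valueExpr); err != nil {
			return nil, err
		}
	}
	return p, nil
}

// matchKey returns true if the key regex matches, or if there is none.
func (p *pattern) matchKey(in string) bool {
	return p.key == nil || p.key.MatchString(in)
}

// matchValue returns true if the value regex matches, or if there is none.
func (p *pattern) matchValue(in string) bool {
	return p.value == nil || p.value.MatchString(in)
}

func main() {
	p, err := newPattern("", "^AUTH_")
	if err != nil {
		panic(err)
	}
	fmt.Println(p.matchKey("anything"))            // true: no key regex
	fmt.Println(p.matchValue("AUTH_1a2b3c4d5e6f")) // true
	fmt.Println(p.matchValue("token"))             // false
}
```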

func (*UserPattern) ParseRegex

func (u *UserPattern) ParseRegex() error

ParseRegex parses all of the user-provided regular expressions for a pattern into Go *regexp.Regexp types

func (*UserPattern) SecretMatcher

func (u *UserPattern) SecretMatcher() SecretMatcher

SecretMatcher returns a SecretMatcher based on the UserPattern, for use with (*Analyzer).AddSecretMatcher()

type UserPatterns

type UserPatterns []*UserPattern

UserPatterns is an alias for a slice of *UserPattern

func ParseUserPatterns

func ParseUserPatterns(r io.Reader) (UserPatterns, error)

ParseUserPatterns accepts an io.Reader pointing to a JSON user-pattern definition file, and returns a list of UserPatterns, and any error that occurred.
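
For a sense of what such a definition file might contain, the sketch below decodes a JSON array into a local struct that mirrors the exported UserPattern fields and tags. The sample pattern, the assumption that the top level is an array, and the `userPattern`, `parsePatterns`, and `parsePatternsString` names are all illustrative; the real parser may validate or compile the patterns as well.

```go
package main

import (
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// userPattern mirrors the exported fields and JSON tags of UserPattern, with
// Severity simplified to a plain string.
type userPattern struct {
	Name     string         `json:"name"`
	Key      string         `json:"key"`
	Value    string         `json:"value"`
	Severity string         `json:"severity"`
	Object   []*userPattern `json:"object"`
}

// parsePatterns decodes a JSON array of patterns from r.
func parsePatterns(r io.Reader) ([]*userPattern, error) {
	var pats []*userPattern
	if err := json.NewDecoder(r).Decode(&pats); err != nil {
		return nil, err
	}
	return pats, nil
}

// parsePatternsString is a convenience wrapper for in-memory definitions.
func parsePatternsString(s string) ([]*userPattern, error) {
	return parsePatterns(strings.NewReader(s))
}

func main() {
	// A made-up definition matching the README's fakeApi example.
	const def = `[
	  {"name": "fakeApi", "key": "api", "value": "^AUTH_", "severity": "low"}
	]`

	pats, err := parsePatternsString(def)
	if err != nil {
		panic(err)
	}
	for _, p := range pats {
		fmt.Printf("%s: key=%q value=%q severity=%s\n", p.Name, p.Key, p.Value, p.Severity)
	}
}
```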

func (UserPatterns) SecretMatchers

func (u UserPatterns) SecretMatchers() []SecretMatcher

SecretMatchers returns a slice of SecretMatcher for use with (*Analyzer).AddSecretMatchers()

Directories

Path Synopsis
cmd
examples
