jsluice

command
v0.0.0-...-0ddfab1 Latest
Published: Jan 10, 2024 License: MIT Imports: 15 Imported by: 0

# jsluice command-line tool

The `jsluice` command-line tool extracts URLs, paths, secrets, and other interesting bits
from JavaScript files.

Values are extracted based not just on how they *look*, but also based on how they are *used*.  

That means `jsluice` can find the path in this code:

```javascript
fetch('/api/users?id=' + userId + '&format=json', {
  method: "GET",
  headers: {
    "X-Env": "stage"
  }
})
```

It can also find the method and headers:

```
▶ jsluice urls demo.js | jq
{
  "url": "/api/users?id=EXPR&format=json",
  "queryParams": ["id", "format"],
  "method": "GET",
  "headers": {
    "X-Env": "stage"
  },
  "type": "fetch"
}
```

Because `jsluice` is doing [static analysis](https://en.wikipedia.org/wiki/Static_program_analysis), it
can't know the value of that `userId` variable, but it *does* understand string concatenation. The values
of expressions like this are replaced with `EXPR` by default, but that can be changed with the
`-P`/`--placeholder` flag.
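
Because the placeholder is an ordinary string, extracted URLs stay parseable, and a downstream tool can tell which parameters were built from runtime expressions. The following is a minimal sketch of that idea in Go using only the standard library; the `dynamicParams` helper is a hypothetical name for illustration and is not part of `jsluice`:

```go
package main

import (
	"fmt"
	"net/url"
	"sort"
)

// dynamicParams returns the sorted names of query parameters whose value is
// exactly the placeholder, i.e. parameters that came from a runtime expression.
// This is an illustrative helper, not a jsluice API.
func dynamicParams(rawURL, placeholder string) ([]string, error) {
	u, err := url.Parse(rawURL)
	if err != nil {
		return nil, err
	}
	var names []string
	for name, values := range u.Query() {
		for _, v := range values {
			if v == placeholder {
				names = append(names, name)
				break
			}
		}
	}
	sort.Strings(names)
	return names, nil
}

func main() {
	// The URL is taken from the example output above.
	names, err := dynamicParams("/api/users?id=EXPR&format=json", "EXPR")
	if err != nil {
		panic(err)
	}
	fmt.Println(names) // prints [id]
}
```

If you change the placeholder with `-P`, pass the same string as the second argument.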

## Contents
* [Installation](#install)
* [Usage](#usage)
    * [Extracting URLs](#extracting-urls)
        * [Resolving Relative Paths](#resolving-relative-paths)
        * [Including Original Source](#including-original-source)
    * [Extracting Secrets](#extracting-secrets)
        * [Custom Secret Matchers](#custom-secret-matchers)
    * [Printing Syntax Trees](#printing-syntax-trees)
    * [Running Queries](#running-queries)
    * [Formatting JavaScript Source](#formatting-javascript-source)
    * [Using remote files over HTTP](#requesting-files-from-remote-hosts)
    * [Using WARC files](#using-warc-files)
    * [Getting help](#help)

## Install

To install `jsluice` you need [Go](https://go.dev/doc/install).

Once Go is installed and configured, run:

```
▶ go install github.com/BishopFox/jsluice/cmd/jsluice@latest
```

If everything worked correctly, you should be able to run `jsluice --help` and
see the [help output](#help).


## Usage

Provide `jsluice` with a mode, any options, and a list of JavaScript files. Files can be local paths or remote `http://` and `https://` URLs:

```
jsluice <mode> [options] [file...]
```

You can also provide files one-per-line on `stdin`:

```
find . -name '*.js' | jsluice <mode> [options]
```

`jsluice` has five modes:
* `urls` - for extracting URLs and paths
* `secrets` - for finding API keys, passwords, and other secrets
* `tree` - for printing syntax trees
* `query` - for running tree-sitter queries
* `format` - for formatting JavaScript source

Output is in [JSONL](https://jsonlines.org/) format. Piping `jsluice` to a tool
like [jq](https://jqlang.github.io/jq/) allows for human-readable formatting,
filtering and further processing.

### Extracting URLs

In `urls` mode, `jsluice` extracts URLs and paths from several different places:

* Assignments to `document.location`, `val.href`, `val.src`, etc.
* Calls to `location.replace`, `window.open`, and `fetch`
* Uses of `XMLHttpRequest`
* Calls to jQuery's `$.get`, `$.post`, and `$.ajax`
* Any string literal that contains something that looks like a URL

If you want to ignore string-literal matches you can use the `-I`/`--ignore-strings` flag.

When possible, HTTP methods, headers, and other request details are also extracted.

Here's a call to [jQuery](https://jquery.com/)'s `$.ajax` as an example:

```javascript
$.ajax({
    method: "PUT",
    url: "/api/v1/posts",
    data:{ postId: 324 },
    headers: {
        "Content-Type": "application/json",
        "x-backend": "prod"
    }},
    function(data, status){
        location.href = data.redirect;
    }
)
```

And the output from `jsluice`:

```
▶ jsluice urls jquery.js | jq
{
  "url": "/api/v1/posts",
  "queryParams": [],
  "bodyParams": [
    "postId"
  ],
  "method": "PUT",
  "headers": {
    "Content-Type": "application/json",
    "x-backend": "prod"
  },
  "type": "$.ajax",
  "filename": "jquery.js"
}
```

#### Resolving Relative Paths

Relative paths can be resolved using a base URL provided with the `-R`/`--resolve-paths` flag.

```
▶ cat location.js
document.location = '../../guestbook.html'

▶ jsluice urls location.js -I -R https://example.com/~tom/photos/2003/ | jq
{
  "url": "https://example.com/~tom/guestbook.html",
  "queryParams": [],
  "bodyParams": [],
  "method": "GET",
  "type": "locationAssignment",
  "filename": "location.js"
}
```
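
The resolution in this example follows the usual rules for relative references. As a rough illustration, Go's standard `net/url` package produces the same result for the same inputs; this is a sketch of standard URL resolution, not necessarily how `jsluice` implements the flag, and `resolve` is a hypothetical helper name:

```go
package main

import (
	"fmt"
	"net/url"
)

// resolve resolves ref against base using standard relative-reference rules.
func resolve(base, ref string) (string, error) {
	b, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	r, err := url.Parse(ref)
	if err != nil {
		return "", err
	}
	return b.ResolveReference(r).String(), nil
}

func main() {
	// Each "../" removes one directory from the base path:
	// /~tom/photos/2003/ -> /~tom/photos/ -> /~tom/
	out, err := resolve("https://example.com/~tom/photos/2003/", "../../guestbook.html")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // prints https://example.com/~tom/guestbook.html
}
```

Note that the trailing slash on the base URL matters: without it, `2003` is treated as a file name rather than a directory.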

#### Including Original Source

Sometimes it's useful to be able to see the complete source code that a URL was extracted from.
Using the `-S`/`--include-source` flag adds a `source` field to the results containing that source code:

```
▶ jsluice urls location.js -I -S | jq
{
  "url": "../../guestbook.html",
  "queryParams": [],
  "bodyParams": [],
  "method": "GET",
  "type": "locationAssignment",
  "source": "document.location = '../../guestbook.html'",
  "filename": "location.js"
}
```

### Extracting Secrets

The `secrets` mode is for extracting API keys, passwords, and other interesting bits of data.

There are built-in extractors for:

* AWS keys
* GCP keys
* GitHub keys
* Firebase configurations

That's not very many, so you can supply your own in a file specified with the `-p`/`--patterns` flag.

Here's an example of some JavaScript that contains an AWS key:

```javascript
var config = {
    bucket: "examplebucket",
    awsKey: "AKIAIOSFODNN7EXAMPLE",
    awsSecret: "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    server: "someserver.example.com"
};
```

And the output of `jsluice secrets` when run against that file:

```
▶ jsluice secrets awskey.js | jq
{
  "kind": "AWSAccessKey",
  "data": {
    "key": "AKIAIOSFODNN7EXAMPLE",
    "secret": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"
  },
  "filename": "awskey.js",
  "severity": "high",
  "context": {
    "awsKey": "AKIAIOSFODNN7EXAMPLE",
    "awsSecret": "wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY",
    "bucket": "examplebucket",
    "server": "someserver.example.com"
  }
}
```

The key and associated secret are put in the `data` field with predictable names to ease
the automation of, for example, checking the validity of found secrets.

The entire object in which the secret was found is included in the `context` field.
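
As a rough sketch of the kind of automation this enables, the Go snippet below pulls the key and secret out of a single line of `secrets` output so they could be handed to a validity check. It assumes the `kind` and `data` field names shown in the example above; `secretFinding` and `awsCredentials` are hypothetical names, and no actual validation is performed:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// secretFinding mirrors a subset of the fields in the example output above.
// It is an assumption for illustration, not a type exported by jsluice.
type secretFinding struct {
	Kind     string         `json:"kind"`
	Data     map[string]any `json:"data"`
	Filename string         `json:"filename"`
	Severity string         `json:"severity"`
}

// awsCredentials returns the key and secret from one JSONL line if the
// finding is an AWSAccessKey with a non-empty key.
func awsCredentials(line []byte) (key, secret string, ok bool) {
	var f secretFinding
	if err := json.Unmarshal(line, &f); err != nil || f.Kind != "AWSAccessKey" {
		return "", "", false
	}
	key, _ = f.Data["key"].(string)
	secret, _ = f.Data["secret"].(string)
	return key, secret, key != ""
}

func main() {
	line := []byte(`{"kind":"AWSAccessKey","data":{"key":"AKIAIOSFODNN7EXAMPLE","secret":"wJalrXUtnFEMI/K7MDENG/bPxRfiCYEXAMPLEKEY"},"filename":"awskey.js","severity":"high"}`)
	if key, secret, ok := awsCredentials(line); ok {
		// A real tool would pass these to whatever validity check it uses.
		fmt.Printf("key=%s secret-length=%d\n", key, len(secret))
	}
}
```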

#### Custom Secret Matchers

A JSON file containing an array of pattern objects can be supplied using the `-p`/`--patterns` flag.

Here's an example of a basic patterns file:

```json
[ 
  { 
    "name": "base64", 
    "value": "(eyJ|YTo|Tzo|PD[89]|rO0)[%a-zA-Z0-9+/]+={0,2}", 
    "severity": "low" 
  }, 
  { 
    "name": "genericSecret", 
    "key": "(secret|private|key)", 
    "value": "[%a-zA-Z0-9+/]+" 
  },
  {
    "name": "firebaseConfig",
    "severity": "high",
    "object": [
      {"key": "apiKey", "value": "^AIza.+"},
      {"key": "authDomain"},
      {"key": "projectId"},
      {"key": "storageBucket"}
    ]
  }
] 
```

Each pattern can have the following fields:

* `name`, which is used in the output
* `severity`, which should be one of `info`, `low`, `medium`, or `high`
* `value`, a regular expression to match against string values
* `key`, a regular expression to match against key names
* `object`, an array of patterns to match against the keys and values of an entire object

All regular expressions use the [Go regex syntax](https://pkg.go.dev/regexp/syntax).
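
Since a typo in a regular expression is easy to miss, it can help to check a patterns file before using it. The sketch below decodes the fields listed above and compiles each expression with Go's `regexp` package. The `pattern` and `objectPattern` types and the `checkPatterns` function are hypothetical, and treating the `key` inside `object` entries as a regular expression is an assumption based on the description above:

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// objectPattern and pattern mirror the fields documented above.
// They are assumptions for illustration, not jsluice's own types.
type objectPattern struct {
	Key   string `json:"key"`
	Value string `json:"value"`
}

type pattern struct {
	Name     string          `json:"name"`
	Severity string          `json:"severity"`
	Value    string          `json:"value"`
	Key      string          `json:"key"`
	Object   []objectPattern `json:"object"`
}

// checkPatterns returns one error per invalid severity or regular expression.
func checkPatterns(data []byte) []error {
	var patterns []pattern
	if err := json.Unmarshal(data, &patterns); err != nil {
		return []error{err}
	}
	var errs []error
	check := func(name, field, expr string) {
		if expr == "" {
			return // field not set
		}
		if _, err := regexp.Compile(expr); err != nil {
			errs = append(errs, fmt.Errorf("%s: bad %s regex: %w", name, field, err))
		}
	}
	// An empty severity is allowed because the example above omits it.
	valid := map[string]bool{"": true, "info": true, "low": true, "medium": true, "high": true}
	for _, p := range patterns {
		if !valid[p.Severity] {
			errs = append(errs, fmt.Errorf("%s: unknown severity %q", p.Name, p.Severity))
		}
		check(p.Name, "value", p.Value)
		check(p.Name, "key", p.Key)
		for _, o := range p.Object {
			check(p.Name, "object key", o.Key)
			check(p.Name, "object value", o.Value)
		}
	}
	return errs
}

func main() {
	data := []byte(`[{"name":"genericSecret","key":"(secret|private|key)","value":"[%a-zA-Z0-9+/]+"}]`)
	for _, err := range checkPatterns(data) {
		fmt.Println(err)
	}
}
```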

Here's a somewhat silly example JavaScript file to run the patterns file against:

```javascript
function getConfig(){ 
    let config = { 
        randomStr: "abc123xyz256", 
        secret: "I quite like PHP", 
    } 
    return "eyJsb2wiOiAic29tZSBKU09OISIsICJjb3VudCI6IDEyM30K" 
} 
```

Running `jsluice secrets` using the above patterns file (saved as `patterns.json`):

```
▶ jsluice secrets -p patterns.json simple-b64.js | jq
{
  "kind": "base64",
  "data": {
    "match": "eyJsb2wiOiAic29tZSBKU09OISIsICJjb3VudCI6IDEyM30K"
  },
  "filename": "simple-b64.js",
  "severity": "low",
  "context": null
}
{
  "kind": "genericSecret",
  "data": {
    "key": "secret",
    "value": "I quite like PHP"
  },
  "filename": "simple-b64.js",
  "severity": "info",
  "context": {
    "randomStr": "abc123xyz256",
    "secret": "I quite like PHP"
  }
}
```

Note that the `base64` matcher worked as expected, but the `genericSecret` matcher
returned a rather different sort of secret than expected. That's because the regular
expression lacks [anchors](https://www.regular-expressions.info/anchors.html):

```
[%a-zA-Z0-9+/]+
```

If you wanted the pattern to match the entire value, the regex could be changed to:

```
^[%a-zA-Z0-9+/]+$
```
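
The difference is easy to confirm with Go's `regexp` package, which uses the same syntax. This is a small sketch that assumes an unanchored pattern counts as a match when it matches any part of the value, which is consistent with the output above; `matches` is a hypothetical helper:

```go
package main

import (
	"fmt"
	"regexp"
)

// matches reports whether expr matches anywhere in value.
func matches(expr, value string) bool {
	return regexp.MustCompile(expr).MatchString(value)
}

func main() {
	loose := `[%a-zA-Z0-9+/]+`
	strict := `^[%a-zA-Z0-9+/]+$`
	for _, v := range []string{"I quite like PHP", "abc123xyz256"} {
		fmt.Printf("%q loose=%t strict=%t\n", v, matches(loose, v), matches(strict, v))
	}
	// Output:
	// "I quite like PHP" loose=true strict=false
	// "abc123xyz256" loose=true strict=true
}
```

The unanchored pattern matches the single letter `I`, which is enough for the whole value to be reported. The anchored pattern rejects the value because of the spaces.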

### Printing Syntax Trees

The `tree` mode prints a textual representation of the syntax tree for each JavaScript file.
This is especially helpful when [writing queries](#running-queries).

The output can be quite long, so here's a tiny example program:

```javascript
console.log("Hello, world!")
```

And the output of `jsluice tree`:

```
▶ jsluice tree hello.js
hello.js:
program
  expression_statement
    call_expression
      function: member_expression
        object: identifier (console)
        property: property_identifier (log)
      arguments: arguments
        string ("Hello, world!")
```

### Running Queries

The `query` mode lets you run [Tree-sitter](https://tree-sitter.github.io/tree-sitter/) queries against JavaScript files.
The query syntax is fully documented [here on the Tree-sitter project site](https://tree-sitter.github.io/tree-sitter/using-parsers#query-syntax).

One of the simplest queries you could run extracts all of the string literals from the input files.

Here's an example file to try it with:

```javascript
const config = {
    stage: false,
    server: "example.com",
    ttl: 3600,
    dns: ["1.1.1.1", "8.8.8.8"],
    paths: {
        "home": "/",
        "blog": "/blog"
    }
}
```

And how to run the query:

```
▶ jsluice query -q '(string) @str' config.js
"example.com"
"1.1.1.1"
"8.8.8.8"
"home"
"/"
"blog"
"/blog"
```

The `@str` capture names the part of the match that should be extracted.
This query matches only one kind of node, but the capture is still required.

`jsluice` tries to make the output valid JSONL where possible, and because it understands
objects, arrays, strings, and so on, it's possible to get JSON representations of those things
as output:

```
▶ jsluice query -q '(object) @match' config.js | jq
{
  "dns": [
    "1.1.1.1",
    "8.8.8.8"
  ],
  "paths": {
    "blog": "/blog",
    "home": "/"
  },
  "server": "example.com",
  "stage": false,
  "ttl": 3600
}
{
  "blog": "/blog",
  "home": "/"
}
```

If you don't want the query output to be JSON-encoded, you can use the `-r`/`--raw-output` flag.

### Formatting JavaScript Source

The `format` mode uses [jsbeautifier-go](https://github.com/ditashi/jsbeautifier-go) to format JavaScript source code:

```
▶ cat testdata/location.min.js
function goToLogin(){location.href="/login/"+document.location.hash.substring(1)} let logout=()=>{document.location.replace("/logout")}

▶ jsluice format testdata/location.min.js
function goToLogin() {
    location.href = "/login/" + document.location.hash.substring(1)
}
let logout = () => {
    document.location.replace("/logout")
}
```

### Requesting files from remote hosts

`jsluice` treats any argument that begins with `http://` or `https://` as a remote file. It fetches each one and processes it in the same run as the local files.

This means that URLs can be specified as well as local files.

```
▶ jsluice urls demo.js https://example.com/jquery.js | jq
{
  "url": "/api/users?id=EXPR&format=json",
  "queryParams": ["id", "format"],
  "method": "GET",
  "headers": {
    "X-Env": "stage"
  },
  "type": "fetch"
}
{
  "url": "/api/v1/posts",
  "queryParams": [],
  "bodyParams": [
    "postId"
  ],
  "method": "PUT",
  "headers": {
    "Content-Type": "application/json",
    "x-backend": "prod"
  },
  "type": "$.ajax",
  "filename": "jquery.js"
}
```
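
The prefix check described above can be sketched in a few lines of Go. This only illustrates the rule as stated in this section; `isRemote` and `splitArgs` are hypothetical names, and whether `jsluice` compares the scheme case-sensitively is not something this sketch knows:

```go
package main

import (
	"fmt"
	"strings"
)

// isRemote reports whether an argument looks like an HTTP(S) URL.
func isRemote(arg string) bool {
	return strings.HasPrefix(arg, "http://") || strings.HasPrefix(arg, "https://")
}

// splitArgs separates local file paths from remote URLs, preserving order.
func splitArgs(args []string) (local, remote []string) {
	for _, a := range args {
		if isRemote(a) {
			remote = append(remote, a)
		} else {
			local = append(local, a)
		}
	}
	return local, remote
}

func main() {
	local, remote := splitArgs([]string{"demo.js", "https://example.com/jquery.js"})
	fmt.Println("local:", local)   // prints local: [demo.js]
	fmt.Println("remote:", remote) // prints remote: [https://example.com/jquery.js]
}
```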

### Using WARC files

When the `-w`/`--warc` flag is specified, `jsluice` treats the input files as
[WARC](https://iipc.github.io/warc-specifications/specifications/warc-format/warc-1.1-annotated/) files.

```
▶ jsluice urls --warc example.warc.gz | jq
{
  "url": "/blog/admin.php?redirect=/login",
  "queryParams": [
    "redirect"
  ],
  "bodyParams": [],
  "method": "GET",
  "type": "location.replace",
  "filename": "https://example.com/blog/"
}
```

### Help

You can see the `jsluice` help output with the `-h`/`--help` flag.

```
▶ jsluice --help
jsluice - Extract URLs, paths, and secrets from JavaScript files

Usage:
  jsluice <mode> [options] [file...]

Modes:
  urls      Extract URLs and paths
  secrets   Extract secrets and other interesting bits
  tree      Print syntax trees for input files
  query     Run a tree-sitter query against input files

Global options:
  -c, --concurrency int        Number of files to process concurrently (default 1)
  -C, --cookie string          Cookies to use when making requests to the specified HTTP based arguments
  -H, --header string          Headers to use when making requests to the specified HTTP based arguments (can be specified multiple times)
  -P, --placeholder string     Set the expression placeholder to a custom string (default 'EXPR')
  -w, --warc                   Treat the input files as WARC (Web ARChive) files

URLs mode:
  -I, --ignore-strings         Ignore matches from string literals
  -S, --include-source         Include the source code where the URL was found
  -R, --resolve-paths <url>    Resolve relative paths using the absolute URL provided

Secrets mode:
  -p, --patterns <file>        JSON file containing user-defined secret patterns to look for

Query mode:
  -q, --query <query>          Tree sitter query to run; e.g. '(string) @matches'
  -r, --raw-output             Do not JSON-encode query output

Examples:
  jsluice urls -C 'auth=true; user=admin;' -H 'Specific-Header-One: true' -H 'Specific-Header-Two: false' local_file.js https://remote.host/example.js
  jsluice query -q '(object) @m' one.js two.js
  find . -name '*.js' | jsluice secrets -c 5 --patterns=apikeys.json
```
