slackdump

package module
v1.3.5
Published: Mar 24, 2022 License: GPL-3.0 Imports: 26 Imported by: 0

README

============
Slack Dumper
============

`Buy me a cup of tea`_

`Join the discussion`_

`Read the set up guide on Medium.com`_


Purpose: dump Slack messages, users and files using browser token and cookie.

Typical use scenarios:

* archive your private conversations from Slack when the administrator
  does not allow you to install applications OR you don't want to use 
  potentially privacy-violating third-party tools, 
* archive channels from Slack when you're on a free "no archive" subscription,
  so you don't lose valuable knowledge in those channels.

The library is "fit-for-purpose" quality and provided AS-IS.  I can't
say it's ready for production, as it lacks most of the unit tests, but
it will do for ad-hoc use.

Slackdump accepts two types of input: 

#. the URL/link of the channel or thread, OR 
#. the ID of the channel.

.. contents::
   :depth: 2


Usage
=====

#. Download the archive for your operating system from the Releases page. (NOTE: **macOS users** should download the ``darwin`` release file.)
#. Unpack the archive.
#. Change directory to where you have unpacked the archive.
#. Run ``./slackdump -h`` to see help.

How to authenticate
-------------------

Getting the authentication data
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

#. Open Slack *in your browser* and log in.

TOKEN
+++++

#. Open your browser's *Developer Console*.
#. Go to the Network tab
#. In the toolbar, switch to ``Fetch/XHR`` view.
#. Open any channel or private conversation in Slack.  You'll see a
   number of requests appear in the Network panel.
#. In the list of requests, find the one starting with
   ``channels.prefs.get?``, click it and click on *Headers* tab in the
   opened pane.
#. Scroll down until you see **Form Data**.
#. Grab the **token:** value (it starts with ``xoxc-``) by
   right-clicking the value and choosing "Copy Value".

**If you don't see the token value** in Google Chrome, switch to the *Payload* tab;
your token is waiting for you there.

COOKIE
++++++

#. Switch to the Application_ tab and select **Cookies** in the left
   navigation pane.
#. Find the cookie with the name "``d``".  That's right, just the
   letter "d".
#. Double-click the Value of this cookie.
#. Press Ctrl+C or Cmd+C to copy its value to the clipboard.
#. Save it for later.

Setting up the application
~~~~~~~~~~~~~~~~~~~~~~~~~~

#. Using any text editor, create a file named ``.env`` next to the
   slackdump executable.  Alternatively, the file can
   be named ``secrets.txt`` or ``.env.txt``.
#. Add the token and cookie values to it. End result
   should look like this::

     SLACK_TOKEN=xoxc-<...elided...>
     COOKIE=12345472908twp<...elided...>

#. Save the file and close the editor.
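
To make the expected layout concrete, here is a minimal sketch of how a
``KEY=VALUE`` file like the one above could be read in Go.  It only
illustrates the file format: the ``parseEnv`` helper is hypothetical and is
not claimed to be how slackdump itself loads the file, and the sample values
are placeholders.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseEnv is a hypothetical helper: it reads KEY=VALUE pairs, one per
// line, skipping blank lines and lines that start with '#'.
func parseEnv(content string) map[string]string {
	vars := make(map[string]string)
	sc := bufio.NewScanner(strings.NewReader(content))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// Split on the first '=' only; the value may contain '=' itself.
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			continue // not a KEY=VALUE line
		}
		vars[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return vars
}

func main() {
	// Placeholder values, not real credentials.
	env := parseEnv("SLACK_TOKEN=xoxc-placeholder\nCOOKIE=placeholder-cookie\n")
	fmt.Println(strings.HasPrefix(env["SLACK_TOKEN"], "xoxc-"), env["COOKIE"] != "")
}
```

Both keys must be present for authentication to work, so a quick check
like the one in ``main`` can catch an incomplete file early.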


Dumping conversations
---------------------

As mentioned in the introduction, Slackdump supports
two ways of specifying the conversations that you want to save:

- **By ID**: it expects to see Conversation IDs.
- **By URL**: it expects to see URLs.  You can get a URL by choosing
  "Copy link" on the channel or thread in Slack.

IDs or URLs can be passed on the command line or read from a file
(using the ``-i`` command line flag).  In that file, every ID or URL
should be placed on a separate line.  Slackdump automatically
detects whether an entry is an ID or a URL.
  
Providing the list on the command line
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

First, list the channels to choose what you want to dump::

  slackdump -c

The output will resemble the following::

  2021/10/31 17:32:34 initializing...
  2021/10/31 17:32:35 retrieving data...
  2021/10/31 17:32:35 done
  ID           Arch  Saved  What
  CHXXXXXXX    -     -      #everything
  CHXXXXXXX    -     -      #everyone
  CHXXXXXXX    -     -      #random
  DHMAXXXXX    -     -      @slackbot
  DNF3XXXXX    -     -      @alice
  DLY4XXXXX    -     -      @bob

You'll need the value in the **ID** column.

To dump the channel, run the following command::

  slackdump <ID1> [ID2] ... [IDn]

By default, slackdump generates a json file with the conversation.  If
you want the conversation to be saved to a text file as well, use the
``-r text`` command line parameter.  See example below.

Example
+++++++

You want to dump conversations with @alice and @bob to text
files and save all the files (attachments) that you all shared in those
conversations::

  slackdump -r text -f DNF3XXXXX DLY4XXXXX https://....
       	    ━━━┯━━━ ━┯ ━━━┯━━━━━ ━━━┯━━━━━ ━━━━┯━━━━━┅┅ 
               │     │    │         │          │
               │     │    │         ╰─: @bob   │
               │     │    ╰───────────: @alice ┊
               │     ╰────────────────: save files
               ╰──────────────────────: text file output
           thread or conversation URL :────────╯

Conversation URL:
	       
To get the conversation URL link, use this simple trick that they
won't teach you at school:
	       
1. In Slack, right click on the conversation you want to dump (in the
   channel navigation pane on the left)
2. Choose "Copy link".

Thread URL:

1. In Slack, open the thread that you want to dump.
2. The thread opens to the right of the main conversation window.
3. On the first message of the thread, click the three vertical dots menu and choose "Copy link".

Run slackdump and provide the URL as input::

  slackdump -f  https://xxxxxx.slack.com/archives/CHM82GX00/p1577694990000400
            ━┯  ━━━━━━┯━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
	     │        ╰─────: URL of the thread
	     ╰──────────────: save files
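
For illustration, a link of the shape shown above can be split into a
channel ID and a thread timestamp.  The sketch below is a hedged example,
not slackdump's own parser: ``parseSlackURL`` is a hypothetical helper, and
it assumes that the last path element is the letter ``p`` followed by the
message timestamp with the decimal point removed (six fractional digits).

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// parseSlackURL is a hypothetical helper that extracts the channel ID and,
// if present, the thread timestamp from a Slack "archives" link.
func parseSlackURL(raw string) (channelID, threadTS string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", err
	}
	parts := strings.Split(strings.Trim(u.Path, "/"), "/")
	if len(parts) < 2 || parts[0] != "archives" || parts[1] == "" {
		return "", "", errors.New("not a Slack archives link")
	}
	channelID = parts[1]
	if len(parts) == 2 {
		return channelID, "", nil // plain conversation link, no thread
	}
	digits := strings.TrimPrefix(parts[2], "p")
	if len(digits) < 7 || digits == parts[2] || strings.Trim(digits, "0123456789") != "" {
		return "", "", errors.New("unexpected message part: " + parts[2])
	}
	// Assumption: the last six digits are the fractional part of the timestamp.
	cut := len(digits) - 6
	return channelID, digits[:cut] + "." + digits[cut:], nil
}

func main() {
	ch, ts, err := parseSlackURL("https://xxxxxx.slack.com/archives/CHM82GX00/p1577694990000400")
	if err != nil {
		panic(err)
	}
	fmt.Println(ch, ts)
}
```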
	     

Reading data from the file
~~~~~~~~~~~~~~~~~~~~~~~~~~

Slackdump can read the list of channels and URLs to dump from a
file.

1. Create a file that will contain all the necessary IDs and/or
   URLs.  I'll use "links.txt" in the example.
2. Copy/paste all the IDs and URLs into that file, one per line.
3. Run slackdump with "-i" command line flag.  "-i" stands for
   "input"::

     slackdump -i links.txt
               ━━━━┯━━━━━━━
	           │        
		   ╰───────: instructs slackdump to use the file input
		   
Dumping users
-------------

To view all users, run::

  slackdump -u

By default, slackdump exports users in text format.  If you need to
output json, use the ``-r json`` flag.

Dumping channels
----------------

To view channels that are visible to your account, including group
conversations, archived chats and public channels, run::

  slackdump -c

By default, slackdump exports channels in text format.  If you need to
output json, use the ``-r json`` flag.

Command line flags reference
============================

This section explains the available command line flags.

This doc may be out of date.  To get the current command line flags
with a brief description, run::

  slackdump -h

Command line flags are described as of version ``v1.3.1``.

\-V
   print version and exit
\-c
   same as -list-channels

\-cookie
   along with ``-t`` sets the authentication values.  Can also be set
   using ``COOKIE`` environment variable.  Must contain the value of
   ``d=`` cookie.

\-cpr
   number of conversation items per request. (default 200).  This is
   the amount of individual messages that will be fetched from Slack
   API per single API request.

\-dl-retries number
   rate limit retries for file downloads. (default 3).  If the file
   download process hits the Slack rate limit response (HTTP error
   429), slackdump will retry the download this number of times, for
   each file.

\-download
   enable files download.  If this flag is specified, slackdump will
   download all attachments, including the ones in threads.

\-download-workers
   number of file download worker threads. (default 4).  File download
   is performed with multiple goroutines.  This is the number of
   goroutines that will be downloading files.  You generally wouldn't
   need to modify this value.

\-dump-from
   timestamp of the oldest message to fetch from
   (e.g. 2020-12-31T23:59:59).  Allows setting the lower boundary of
   the timeframe for conversation dump.  This is useful when you don't
   need everything from the beginning of time.

\-dump-to
   timestamp of the latest message to fetch to
   (e.g. 2020-12-31T23:59:59).  Same as above, but for the upper boundary.

\-f
   shorthand for -download (means "files")
   
\-ft
   output file naming template.  This parameter lets you define
   custom naming for output conversation files.

   It uses the `Go templating`_ system.  Available template tags:

   :{{.ID}}: channel ID
   :{{.Name}}: channel Name
   :{{.ThreadTS}}: thread timestamp.  This tag cannot be used on its
      own; it must be combined with at least one of the above tags.

   You can use any of the standard template functions.  The default
   value for this parameter outputs the channelID as the filename.  For
   threads, it will use channelID-threadTS.

   Below are some of the common templates you could use.

   :Channel ID and thread:
      ::

	 {{.ID}}{{if .ThreadTS}}-{{.ThreadTS}}{{end}}
      
      The output file will look like "``C480129421.json``" for a
      channel with ID=C480129421 and
      "``C480129421-1234567890.123456.json``" for a thread.  This is
      the default template.

   :Channel Name and thread:

      ::

	 {{.Name}}{{if .ThreadTS}}({{.ThreadTS}}){{end}}
	 
      The output file will look like "``general.json``" for the channel and
      "``general(1234567890.123456).json``" for a thread.


\-i
   specify the input file with Channel IDs or URLs to be used instead
   of giving the list on the command line, one per line.  Use "-" to
   read input from STDIN.  Example: ``-i my_links.txt``.
   
\-limiter-boost
   same as -t3-boost. (default 120)
   
\-limiter-burst
   same as -t3-burst. (default 1)

\-list-channels
   list channels (aka conversations) and their IDs for export.  The
   default output format is "text".  Use ``-r json`` to output
   as JSON.

\-list-users
   list users and their IDs.  The default output format is "text".
   Use ``-r json`` to output as JSON.

\-no-user-cache
   skip fetching users.  If this flag is specified, users won't be fetched
   during startup.  This disables username resolution for the text
   output.  I don't know why someone would use this flag, but it's there
   if you must.

\-npr
   chaNnels per request.  The number of channels that will be fetched
   per API request when listing channels.  Setting it to a value higher than
   100 has no effect: Slack never returns more than 100 channels
   per request.  Greedy.

\-o
   output filename for users and channels.  Use '-' for standard
   output. (default "-")
   
\-r
   report (output) format.  One of 'json' or 'text'. For channels and
   users - will output only in the specified format.  For messages -
   if 'text' is requested, the text file will be generated along with
   json.

\-t
   Specify the Slack API token (environment: ``SLACK_TOKEN``).
   This should be used along with the ``-cookie`` flag.

\-t2-boost
   Tier-2 limiter boost in events per minute (affects users and
   channels APIs).

\-t2-burst
   Tier-2 limiter burst in events (affects users and
   channels APIs). (default 1)
   
\-t2-retries
   rate limit retries for channel listing. (affects users and channels APIs).
   (default 20)

\-t3-boost
   Tier-3 rate limiter boost in events per minute, will be added to
   the base slack tier event per minute value.  Affects conversation
   APIs. (default 120)
   
\-t3-burst
   allow up to N burst events per second.  Default value is
   safe. Affects conversation APIs (default 1)

\-t3-retries
   rate limit retries for conversation.  Affects conversation APIs. (default 3)
   
\-trace filename
   enables tracing and sets the trace filename (optional).
   Use this flag if requested by the developer.  The trace file does not contain any
   sensitive data or PII.

\-u
   shorthand for -list-users.

\-user-cache-age
   user cache lifetime duration. Set this to 0 to disable the
   cache. (default 4h0m0s) The user cache is used to speed up subsequent
   runs of slackdump.  Known issue: if you switch Slack
   workspaces, make sure to delete the cache file, or set this to 0.

\-user-cache-file
   user cache filename. (default "users.json") See note
   for -user-cache-age above.

\-v
   verbose messages

As a library
============

Download:

.. code:: shell

  go get github.com/rusq/slackdump

Use:

.. code:: go

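  package main

  import (
    "context"
    "log"
    "os"

    "github.com/rusq/slackdump"
  )

  func main() {
    // New takes a context as its first argument.
    sd, err := slackdump.New(context.Background(), os.Getenv("TOKEN"), os.Getenv("COOKIE"))
    if err != nil {
      log.Fatal(err)
    }
    _ = sd // ... read the docs
  }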

FAQ
===

:Q: **Do I need to create a Slack application?**

:A: No, you don't.  You need to grab that token and cookie from the
    browser Slack session.  See Usage_ at the top of the file.

:Q: **I'm getting "invalid_auth" error**

:A: Get a new cookie and a new token from the browser.



Bulletin Board
--------------

Messages that were conveyed with the donations:

- 25/01/2022: Stay away from `TheSignChef.com`_, ya hear, they don't pay what
  they owe to their employees. 

.. _Application: https://stackoverflow.com/questions/12908881/how-to-copy-cookies-in-google-chrome
.. _`Buy me a cup of tea`: https://www.paypal.com/donate/?hosted_button_id=GUHCLSM7E54ZW
.. _`Join the discussion`: https://t.me/slackdump
.. _`Read the set up guide on Medium.com`: https://medium.com/@gilyazov/downloading-your-private-slack-conversations-52e50428b3c2
.. _`Go templating`: https://pkg.go.dev/html/template

..
  bulletin board links

.. _`TheSignChef.com`: https://www.glassdoor.com.au/Reviews/TheSignChef-com-Reviews-E793259.htm

Documentation

Overview

Package slackdump is a generated GoMock package.

Index

Constants

This section is empty.

Variables

var DefOptions = Options{
	DumpFiles:           false,
	Workers:             defNumWorkers,
	DownloadRetries:     3,
	Tier2Boost:          20,
	Tier2Burst:          1,
	Tier2Retries:        20,
	Tier3Boost:          120,
	Tier3Burst:          1,
	Tier3Retries:        3,
	ConversationsPerReq: 200,
	ChannelsPerReq:      100,
	UserCacheFilename:   "users.json",
	MaxUserCacheAge:     4 * time.Hour,
}

DefOptions is the default options used when initialising slackdump instance.

var ErrRetryFailed = errors.New("callback was not able to complete without errors within the allowed number of retries")

Functions

This section is empty.

Types

type Channel deprecated added in v1.1.0

type Channel = Conversation

Channel keeps the slice of messages.

Deprecated: use Conversation instead.

type Channels

type Channels []slack.Channel

Channels keeps slice of channels

func (Channels) ToText

func (cs Channels) ToText(sd *SlackDumper, w io.Writer) (err error)

ToText outputs Channels cs to io.Writer w in Text format.

type Conversation added in v1.3.0

type Conversation struct {
	Name     string    `json:"name"`
	Messages []Message `json:"messages"`
	// ID is the channel ID.
	ID string `json:"channel_id"`
	// ThreadTS is a thread timestamp.  If it's not empty, it means that it's a
	// dump of a thread, not a channel.
	ThreadTS string `json:"thread_ts,omitempty"`
}

Conversation keeps the slice of messages.
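
The ID, Name and ThreadTS fields are the same values that the README's -ft filename templates refer to. As a hedged illustration, the sketch below renders the two templates from the README against a local stand-in struct. It assumes text/template semantics (the README links to Go templating without this page confirming which template package slackdump uses), and the filename helper and the conversation type are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// conversation is a local stand-in with the three fields used by the
// template tags; it is not the package's Conversation type.
type conversation struct {
	ID       string
	Name     string
	ThreadTS string
}

// filename is a hypothetical helper: it renders tmpl for c and appends ".json".
func filename(tmpl string, c conversation) (string, error) {
	t, err := template.New("name").Parse(tmpl)
	if err != nil {
		return "", err
	}
	var sb strings.Builder
	if err := t.Execute(&sb, c); err != nil {
		return "", err
	}
	return sb.String() + ".json", nil
}

func main() {
	const byID = "{{.ID}}{{if .ThreadTS}}-{{.ThreadTS}}{{end}}"
	for _, c := range []conversation{
		{ID: "C480129421", Name: "general"},
		{ID: "C480129421", Name: "general", ThreadTS: "1234567890.123456"},
	} {
		name, err := filename(byID, c)
		if err != nil {
			panic(err)
		}
		fmt.Println(name)
	}
}
```

Because the thread part sits inside an if block, the same template works for both whole-channel dumps and thread dumps.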

func (Conversation) IsThread added in v1.3.1

func (c Conversation) IsThread() bool

func (Conversation) String added in v1.3.0

func (c Conversation) String() string

func (Conversation) ToText added in v1.3.0

func (m Conversation) ToText(sd *SlackDumper, w io.Writer) (err error)

ToText outputs Messages m to io.Writer w in text format.

type Message added in v1.1.0

type Message struct {
	slack.Message
	ThreadReplies []Message `json:"slackdump_thread_replies,omitempty"`
}

Message is the internal representation of message with thread.

type MockReporter added in v1.3.0

type MockReporter struct {
	// contains filtered or unexported fields
}

MockReporter is a mock of Reporter interface.

func NewMockReporter added in v1.3.0

func NewMockReporter(ctrl *gomock.Controller) *MockReporter

NewMockReporter creates a new mock instance.

func (*MockReporter) EXPECT added in v1.3.0

EXPECT returns an object that allows the caller to indicate expected use.

func (*MockReporter) ToText added in v1.3.0

func (m *MockReporter) ToText(sd *SlackDumper, w io.Writer) error

ToText mocks base method.

type MockReporterMockRecorder added in v1.3.0

type MockReporterMockRecorder struct {
	// contains filtered or unexported fields
}

MockReporterMockRecorder is the mock recorder for MockReporter.

func (*MockReporterMockRecorder) ToText added in v1.3.0

func (mr *MockReporterMockRecorder) ToText(sd, w interface{}) *gomock.Call

ToText indicates an expected call of ToText.

type Option added in v1.1.0

type Option func(*Options)

Option is the signature of the option-setting function.
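
This is the functional options pattern. To show how such option functions compose, here is a self-contained sketch that uses local stand-ins instead of the real package: the options struct copies only two fields from Options, and downloadFiles, retryDownloads and apply are hypothetical names that merely mirror the shape of the documented functions.

```go
package main

import "fmt"

// options is a local stand-in holding two of the fields listed in Options.
type options struct {
	DumpFiles       bool
	DownloadRetries int
}

// option mirrors the documented shape: type Option func(*Options).
type option func(*options)

// downloadFiles returns an option that toggles file downloads.
func downloadFiles(b bool) option {
	return func(o *options) { o.DumpFiles = b }
}

// retryDownloads returns an option that sets the download retry count.
func retryDownloads(attempts int) option {
	return func(o *options) { o.DownloadRetries = attempts }
}

// apply starts from the given defaults and lets each option modify the copy.
func apply(defaults options, opts ...option) options {
	for _, opt := range opts {
		opt(&defaults)
	}
	return defaults
}

func main() {
	defaults := options{DumpFiles: false, DownloadRetries: 3}
	fmt.Printf("%+v\n", apply(defaults, downloadFiles(true), retryDownloads(5)))
}
```

Options that are not passed leave the corresponding default untouched, which is why a constructor can accept a variadic list of them.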

func DownloadFiles added in v1.3.0

func DownloadFiles(b bool) Option

DownloadFiles enables or disables the conversation/thread file downloads.

func MaxUserCacheAge added in v1.3.0

func MaxUserCacheAge(d time.Duration) Option

MaxUserCacheAge allows to set the maximum user cache age. If set to 0 - it will always use the API output, and never load cache.

func NumWorkers added in v1.1.1

func NumWorkers(n int) Option

NumWorkers allows to set the number of file download workers. n should be in range [1, NumCPU]. If not in range, will be reset to a defNumWorkers number, which seems reasonable.

func RetryDownloads added in v1.2.0

func RetryDownloads(attempts int) Option

RetryDownloads sets the number of attempts to download a file when getting rate limited.

func RetryThreads added in v1.2.0

func RetryThreads(attempts int) Option

RetryThreads sets the number of attempts when dumping conversations and threads, and getting rate limited.

func Tier2Boost added in v1.3.1

func Tier2Boost(eventsPerMin uint) Option

Tier2Boost allows to deliver a magic kick to the limiter, to override the base slack tier limits. The resulting events per second will be calculated like this:

events_per_sec =  (<slack_tier_epm> + <eventsPerMin>) / 60.0
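
As a worked example of the formula, the sketch below plugs in numbers. The base rates are assumptions for illustration only (20 events per minute for Tier 2 and 50 for Tier 3); the boosts of 20 and 120 are the Tier2Boost and Tier3Boost values from DefOptions. eventsPerSec is a hypothetical helper.

```go
package main

import "fmt"

// eventsPerSec applies the documented formula:
// (slack_tier_epm + eventsPerMin) / 60.0
func eventsPerSec(tierEPM, boostEPM uint) float64 {
	return float64(tierEPM+boostEPM) / 60.0
}

func main() {
	// Assumed base rates: 20 events/min for Tier 2, 50 events/min for Tier 3.
	fmt.Printf("tier 2: %.2f events/sec\n", eventsPerSec(20, 20))
	fmt.Printf("tier 3: %.2f events/sec\n", eventsPerSec(50, 120))
}
```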

func Tier2Burst added in v1.3.1

func Tier2Burst(eventsPerSec uint) Option

Tier2Burst allows to set the limiter burst value.

func Tier3Boost added in v1.3.1

func Tier3Boost(eventsPerMin uint) Option

Tier3Boost allows to deliver a magic kick to the limiter, to override the base slack tier limits. The resulting events per second will be calculated like this:

events_per_sec =  (<slack_tier_epm> + <eventsPerMin>) / 60.0

func Tier3Burst added in v1.3.1

func Tier3Burst(eventsPerSec uint) Option

Tier3Burst allows to set the limiter burst value.

func UserCacheFilename added in v1.3.0

func UserCacheFilename(s string) Option

UserCacheFilename allows to set the user cache filename.

type Options added in v1.3.1

type Options struct {
	DumpFiles           bool          // will we save the conversation files?
	Workers             int           // number of file-saving workers
	DownloadRetries     int           // if we get rate limited on file downloads, this is how many times we're going to retry
	Tier2Boost          uint          // tier-2 limiter boost
	Tier2Burst          uint          // tier-2 limiter burst
	Tier2Retries        int           // tier-2 retries when getting 429 on channels fetch
	Tier3Boost          uint          // tier-3 limiter boost allows to increase or decrease the slack tier req/min rate.  Affects all tiers.
	Tier3Burst          uint          // tier-3 limiter burst allows to set the limiter burst in req/sec.  Default of 1 is safe.
	Tier3Retries        int           // number of retries to do when getting 429 on conversation fetch
	ConversationsPerReq int           // number of messages we get per 1 API request. bigger the number, less requests, but they become more beefy.
	ChannelsPerReq      int           // number of channels to fetch per 1 API request.
	UserCacheFilename   string        // user cache filename
	MaxUserCacheAge     time.Duration // how long the user cache is valid for.
	NoUserCache         bool          // sometimes slack disallows user access, so we need a way to overcome that.
}

Options is the option set for the slackdumper.

type Reporter

type Reporter interface {
	ToText(sd *SlackDumper, w io.Writer) error
}

Reporter is an interface defining output functions

type SlackDumper

type SlackDumper struct {

	// Users contains the list of users and populated on NewSlackDumper
	Users     Users                  `json:"users"`
	UserIndex map[string]*slack.User `json:"-"`
	// contains filtered or unexported fields
}

SlackDumper stores basic session parameters.

func New

func New(ctx context.Context, token string, cookie string, opts ...Option) (*SlackDumper, error)

New creates new client and populates the internal cache of users and channels for lookups.

func NewWithOptions added in v1.3.1

func NewWithOptions(ctx context.Context, token string, cookie string, opts Options) (*SlackDumper, error)

func (*SlackDumper) DumpMessages

func (sd *SlackDumper) DumpMessages(ctx context.Context, channelID string) (*Conversation, error)

DumpMessages fetches messages from the conversation identified by channelID.

func (*SlackDumper) DumpMessagesInTimeframe added in v1.3.1

func (sd *SlackDumper) DumpMessagesInTimeframe(ctx context.Context, channelID string, oldest, latest time.Time) (*Conversation, error)

DumpMessagesInTimeframe dumps messages in the given timeframe between oldest and latest. If oldest or latest are zero time, they will not be accounted for. Having both oldest and latest as zero time, will make this function behave similar to DumpMessages.

func (*SlackDumper) DumpThread added in v1.0.3

func (sd *SlackDumper) DumpThread(ctx context.Context, channelID, threadTS string) (*Conversation, error)

func (*SlackDumper) DumpURL added in v1.3.0

func (sd *SlackDumper) DumpURL(ctx context.Context, slackURL string) (*Conversation, error)

DumpURL dumps messages from the slack URL, it supports conversations and individual threads.

func (*SlackDumper) DumpURLInTimeframe added in v1.3.1

func (sd *SlackDumper) DumpURLInTimeframe(ctx context.Context, slackURL string, oldest, latest time.Time) (*Conversation, error)

func (*SlackDumper) GetChannels

func (sd *SlackDumper) GetChannels(ctx context.Context, chanTypes ...string) (Channels, error)

GetChannels lists all conversations for a user. `chanTypes` specifies the type of messages to fetch. See github.com/rusq/slack docs for possible values.

func (*SlackDumper) GetUsers

func (sd *SlackDumper) GetUsers(ctx context.Context) (Users, error)

GetUsers retrieves all users either from cache or from the API.

func (*SlackDumper) IsUserDeleted added in v1.3.0

func (sd *SlackDumper) IsUserDeleted(id string) bool

IsUserDeleted checks if the user is deleted and returns appropriate value. It will assume user is not deleted, if it's not present in the user index.

func (*SlackDumper) SaveFileTo

func (sd *SlackDumper) SaveFileTo(ctx context.Context, dir string, f *slack.File) (int64, error)

SaveFileTo saves a single file to the specified directory.

func (*SlackDumper) SenderName added in v1.3.0

func (sd *SlackDumper) SenderName(msg *Message) string

SenderName returns username for the message

type Users

type Users []slack.User

Users is a slice of users.

func (Users) IndexByID added in v1.1.0

func (us Users) IndexByID() map[string]*slack.User

IndexByID returns the userID map to relevant *slack.User
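
A minimal sketch of such an index, together with the lookup rule described for IsUserDeleted (an ID that is missing from the index is treated as not deleted), might look like this. The user type is a local stand-in for slack.User, and indexByID and isDeleted are hypothetical names, so this is an approximation rather than the package's code.

```go
package main

import "fmt"

// user is a local stand-in for slack.User with only the fields needed here.
type user struct {
	ID      string
	Name    string
	Deleted bool
}

// indexByID maps each user ID to a pointer to the matching slice element.
func indexByID(us []user) map[string]*user {
	idx := make(map[string]*user, len(us))
	for i := range us {
		idx[us[i].ID] = &us[i] // point at the slice element, not a loop copy
	}
	return idx
}

// isDeleted reports whether the user is deleted; unknown IDs are assumed
// to be not deleted.
func isDeleted(idx map[string]*user, id string) bool {
	u, ok := idx[id]
	return ok && u.Deleted
}

func main() {
	users := []user{{ID: "U1", Name: "alice"}, {ID: "U2", Name: "bob", Deleted: true}}
	idx := indexByID(users)
	fmt.Println(idx["U1"].Name, isDeleted(idx, "U2"), isDeleted(idx, "U404"))
}
```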

func (Users) ToText

func (us Users) ToText(_ *SlackDumper, w io.Writer) error

ToText outputs Users us to io.Writer w in Text format

Directories

Path Synopsis
cmd
internal
app
mock_os
Package mock_os is a generated GoMock package.
tracer
Package tracer is simple convenience wrapper around writing trace to a file.
