logseq-sync module · Published: Feb 3, 2024 · License: MIT

Logseq Sync

An attempt at an open-source version of the Logseq Sync service, intended for individual, self-hosted use.

It's vaguely functional (see What Works? below), but decidedly pre-alpha software. Definitely don't try to point a real, populated Logseq client at it; I have no idea what will happen.

What's Done/Exists?

Right now, the repo contains (in cmd/server) a mostly implemented version of the Logseq Sync API: credentialed blob uploads, signed blob downloads, a SQLite database for persistence, and at least partial implementations of most of the rest of the API surface.
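To give a flavor of the signed-download half, here's a rough sketch using the AWS SDK for Go v2's presigner. The bucket and key are made up, and this isn't the actual code from the blob/awsblob package:

```go
package main

import (
	"context"
	"fmt"
	"log"
	"time"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

func main() {
	ctx := context.Background()

	// Credentials and region come from the environment or shared AWS config.
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatalf("loading AWS config: %v", err)
	}

	// Presign a GET so the client can fetch an encrypted blob directly from
	// S3 without ever holding long-lived credentials.
	presigner := s3.NewPresignClient(s3.NewFromConfig(cfg))
	signed, err := presigner.PresignGetObject(ctx, &s3.GetObjectInput{
		Bucket: aws.String("my-logseq-sync-bucket"),          // hypothetical bucket
		Key:    aws.String("graphs/example-graph/some-blob"), // hypothetical object key
	}, s3.WithPresignExpires(15*time.Minute))
	if err != nil {
		log.Fatalf("presigning download: %v", err)
	}

	fmt.Println("signed download URL:", signed.URL)
}
```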

Currently, running any of this requires a modified version of the Logseq codebase (here) and the @logseq/rsapi package (here).

On that note, many thanks to the Logseq Team for open-sourcing rsapi recently; it made this project significantly easier to work with.

What Works?

With a modified Logseq, you can use the local server to:

  1. Create a graph
  2. Upload (passphrase-encrypted) encryption keys
  3. Get temporary AWS credentials to upload your encrypted files to your private S3 bucket
  4. Upload your encrypted files
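To make steps 3 and 4 a bit more concrete: the temporary credentials are STS-style (access key, secret key, session token), and the client uses them to build a short-lived S3 client and PUT the encrypted file. A rough Go sketch of the mechanics (the real client-side work happens in rsapi, and every name here is made up rather than taken from the actual API response):

```go
package main

import (
	"bytes"
	"context"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/credentials"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

// uploadEncrypted pushes an already-encrypted blob to S3 using the temporary
// credentials handed back by the sync server. Every name here is illustrative.
func uploadEncrypted(ctx context.Context, accessKey, secretKey, sessionToken, bucket, key string, ciphertext []byte) error {
	client := s3.New(s3.Options{
		Region:      "us-east-1", // whichever region the bucket lives in
		Credentials: credentials.NewStaticCredentialsProvider(accessKey, secretKey, sessionToken),
	})
	_, err := client.PutObject(ctx, &s3.PutObjectInput{
		Bucket: aws.String(bucket),
		Key:    aws.String(key),
		Body:   bytes.NewReader(ciphertext),
	})
	return err
}

func main() {
	// In the real flow these values come from the sync server, not literals.
	err := uploadEncrypted(context.Background(),
		"ASIA-example", "example-secret", "example-session-token",
		"my-logseq-sync-bucket", "graphs/example-graph/some-blob",
		[]byte("already-encrypted bytes"))
	if err != nil {
		log.Fatalf("upload failed: %v", err)
	}
}
```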

And that's basically the full end-to-end flow! The big remaining things are:

  • Implement the WebSockets protocol
  • Figure out how/when to increment the transaction (tx) counter

API Documentation

There's some documentation for the API in docs/API.md. This is the area where I could most use more information/help; see Contributing below.

Open Questions

S3 API

The real Logseq Sync API gets temporary S3 credentials and uploads files directly to S3. I haven't looked closely enough to see if we can swap this out for something S3-compatible like s3proxy or MinIO; see #2 for a bit more discussion.

Currently, amazonaws.com is hardcoded in the client, so that'll be part of a larger discussion on how to make all of this configurable in the long run.
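For what it's worth, on the Go server side pointing the AWS SDK at an S3-compatible store like MinIO is mostly an endpoint override; the harder part is the endpoint hardcoded in the client/rsapi. A minimal sketch, with an assumed local MinIO address and nothing this repo currently wires up:

```go
package main

import (
	"context"
	"log"

	"github.com/aws/aws-sdk-go-v2/aws"
	"github.com/aws/aws-sdk-go-v2/config"
	"github.com/aws/aws-sdk-go-v2/service/s3"
)

func main() {
	ctx := context.Background()
	cfg, err := config.LoadDefaultConfig(ctx)
	if err != nil {
		log.Fatalf("loading AWS config: %v", err)
	}

	// Point the S3 client at a local MinIO instance instead of amazonaws.com.
	client := s3.NewFromConfig(cfg, func(o *s3.Options) {
		o.BaseEndpoint = aws.String("http://localhost:9000") // assumed MinIO address
		o.UsePathStyle = true                                // MinIO generally expects path-style addressing
	})

	out, err := client.ListBuckets(ctx, &s3.ListBucketsInput{})
	if err != nil {
		log.Fatalf("listing buckets: %v", err)
	}
	for _, b := range out.Buckets {
		log.Printf("bucket: %s", aws.ToString(b.Name))
	}
}
```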

Associated Changes to Logseq

Being able to connect to a self-hosted sync server requires some changes to Logseq as well, namely to specify where your sync server can be accessed. Those changes are in a rough, non-functional state here: https://github.com/logseq/logseq/compare/master...bcspragu:logseq:brandon/settings-hack

Adding a database migration

The self-hosted sync backend has rudimentary support for persistence in a SQLite database. We use sqlc to generate Go code for our SQL queries, and Atlas to generate schema migration diffs.
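For a sense of what sqlc gives us, the generated code boils down to a Queries struct with one typed method per named query. A hand-written approximation of that shape (the Graph type and query here are hypothetical, not our actual schema):

```go
package sqliteexample

import (
	"context"
	"database/sql"
)

// This mirrors the shape of what sqlc generates from db/sqlite/schema.sql and
// its query files: a Queries struct wrapping a database handle, with one typed
// method per named query.

const getGraph = `SELECT id, name FROM graphs WHERE id = ?`

// Graph is a stand-in for a sqlc-generated row type.
type Graph struct {
	ID   string
	Name string
}

// Queries is a stand-in for sqlc's generated query wrapper.
type Queries struct {
	db *sql.DB
}

func New(db *sql.DB) *Queries {
	return &Queries{db: db}
}

// GetGraph is roughly what a `-- name: GetGraph :one` annotation turns into.
func (q *Queries) GetGraph(ctx context.Context, id string) (Graph, error) {
	var g Graph
	err := q.db.QueryRowContext(ctx, getGraph, id).Scan(&g.ID, &g.Name)
	return g, err
}
```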

The process for changing the database schema looks like:

  1. Update db/sqlite/schema.sql with your desired changes
  2. Run ./scripts/add_migration.sh <name of migration> to generate the relevant migration
  3. Run ./scripts/apply_migrations.sh to apply the migrations to your SQLite database

Why do it this way?

With this workflow, the db/sqlite/migrations/ directory is more or less unused by both sqlc and the actual server program. The reason it's structured this way is to keep a more reviewable audit log of changes to the database, which a single schema.sql doesn't give you.

Contributing

If you're interested in contributing, thanks! I sincerely appreciate it. There are a few main avenues for contributions:

Getting official buy-in from Logseq

The main blocker right now is getting buy-in from the Logseq team, as I don't want to do the work to add self-hosting settings to the Logseq codebase if they won't be accepted upstream. I've raised the question on the Logseq forums, as well as in a GitHub Discussion on the Logseq repo, but have received no official response.

Understanding/documenting the API

One area where I would love help is specifying the official API more accurately. My API docs are based on a dataset of one: my own account. So there are areas that are underspecified or unknown, or where I just don't understand the flow. Any help there would be great!

Specifically, I'd like to understand:

  1. The details of the WebSocket protocol (doc started here), and
  2. How and when to update the transaction counter, tx, in the API

Debugging S3 signature issues

I believe there's a bug (filed upstream, initially here) in the s3-presign crate used by Logseq's rsapi component, which handles the actual sync protocol bits (encryption, key generation, S3 upload, etc).

The bug causes flaky uploads with self-hosted, AWS-backed (i.e. S3 + STS) servers, but I haven't had the time to investigate the exact root cause. The source code for the s3-presign crate is available here, but the GitHub repo itself doesn't appear to be public.
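If you want to poke at this, the failure is easiest to see by taking a presigned PUT URL and exercising it with a plain HTTP client (cmd/tools/signtest in this repo exists for roughly this kind of testing). A minimal sketch of the exercising side only; generating the URL itself is left to rsapi or the AWS SDK:

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"log"
	"net/http"
	"os"
)

// Takes an already-presigned PUT URL and tries a small upload against it.
// On a signature problem, S3 typically responds with a 403 and a
// SignatureDoesNotMatch error body.
func main() {
	if len(os.Args) < 2 {
		log.Fatal("usage: presign-check <presigned-put-url>")
	}

	req, err := http.NewRequest(http.MethodPut, os.Args[1], bytes.NewReader([]byte("test payload")))
	if err != nil {
		log.Fatalf("building request: %v", err)
	}

	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		log.Fatalf("performing upload: %v", err)
	}
	defer resp.Body.Close()

	body, _ := io.ReadAll(resp.Body)
	fmt.Printf("status: %s\n%s\n", resp.Status, body)
}
```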

Directories

  blob: Package blob defines domain types for interacting with blob storage.
  blob/awsblob: Package awsblob provides the blob operations needed by Logseq Sync, backed by AWS's S3.
  cmd/server: Command server will eventually feature a self-contained Logseq Sync service.
  cmd/tools/signtest: Command signtest is a quick tool for testing the generation and use of presigned S3 upload URLs.
  db: Package db contains domain types for working with persisted Logseq data.
  db/mem: Package mem implements an in-memory version of our DB interface, for quick iteration and local testing.
  db/sqlite: Package sqlite provides a thin wrapper over the sqlc-generated SQLite wrapper to adhere to our server's DB interface.
