spsync

package module
v0.0.0-...-745bdff Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Mar 24, 2024 License: MIT Imports: 7 Imported by: 6

README

spsync

Go library for robust cloud native SharePoint Lists synchronization or backup

Gosip

The library implements optimal synchronization of SharePoint Lists taking care of pagination, large lists thresholds, incremental sync via changes API, deletion and items restore tracking, and more.

It is designed to run as a scheduled job within serverless scenarios. The incremental mode could be adapted for Webhooks signals.

In a scheduler mode, let's assume the Azure Functions timer job, after single or multiple executions (which could be terminated with a timeout) full sync is completed, then incremental mode is effectively catching up with only changed items in seconds.

Here are the design and core principles of how it works.

Entity sync metadata persistent state

Entity Change Token Page Token Mode Stage Sync Date Failed Sessions
List/ListA 1;3;b9...6db9 Paged=TRUE&p_ID=6387 Full Upsert 0
Synchronization Modes
  • Full
  • Incr
Synchronization Stages
  • Upsert
  • Delete

Full synchronization mode

Full synchronization mode condition(s)

Mode == Full or a Change token is not set.

Full synchronization mode logic
  • On a blank session:

    • save current change token and timestamp (sync start not completion)
    • change mode to Full
  • On a continue session:

    • start from a previous page skip token (if any)
  • Default sort (by ID ascending)

  • Paginate by N items

  • Save a previous page skip token (on page successfully processed)

  • Bulk processing upserts (in a custom hook handler)

  • Control time (for serverless scenarios)

  • Fail aware and continue logic (continue from page token)

  • On completion:

    • update sync date after a successful session
    • clear session page skip token
    • change mode to Incr
  • Tracking deletions

    • Paginate through all items with only ID selected
    • Using max page size of 5000 items
    • Detect gaps in ID sequence
    • Bulk process deletions (in a custom hook handler)

Incremental synchronization mode

Incremental synchronization mode condition(s)

Mode == Incr and a Change token is not empty.

Incremental synchronization mode logic
  • Use Change Token and change API
  • Track: Adds, Updates, Deletes, Restores
  • Paginate through by N items
  • Save a previous page skip token (on page successfully processed)
  • Process changes list
    • For added/updated/restored items: request by IDs, and process (in a custom hook handler)
    • Bulk process deletions (in a custom hook handler)
  • Save the latest change token from a previously processed page
  • Control time (for serverless scenarios)
  • By design, a failure is amended on a next incremental run
    • however, failed sessions should be tracked
    • with too many fails - create an incident
  • On completion:
    • update sync date after a successful session
    • reset failed sessions counter

Usage sample

package main

import (
	"context"
	"fmt"
	"os"

	"github.com/joho/godotenv"
	"github.com/koltyakov/gosip"
	"github.com/koltyakov/gosip/api"
	strategy "github.com/koltyakov/gosip/auth/saml"
	"github.com/koltyakov/gosip/cpass"
	"github.com/koltyakov/spsync"
)

func main() {
	// Load environment variables needed for sync processing
	_ = godotenv.Load()

	ctx := context.Background()

	// Setup sync configuration per a specific entity.
	// Usually you need to sync multiple Lists so most of a logic from this sample
	// iterates for synced entities.
	// In a serverless scenarios, sync session timeout is controlled and terminated
	// by the caller after timeout. It's OK, the resumed session is designed
	// to continue since a previous successful incremental state.

	opts := &spsync.Options{
		// Create and bind SharePoint API client, see more https://go.spflow.com
		// For multiple entities sync initiate the client externally
		SP: NewSP(),
		// Pass current state from a persistent storage (e.g. database or SharePoint list)
		State: &spsync.State{
			EntID:    "Lists/MyList",
			SyncMode: spsync.Full,
		},
		// Pass configuration for a specific entity, usually stored in config file
		Ent: &spsync.Ent{
			Select: []string{"Id", "Title"},
		},
		// Provide a handler to deal with created and updated items
		Upsert: func(ctx context.Context, items []spsync.Item) error {
			// Implement your logic here for your target system:
			// Bulk create or update if a target system supports batch processing
			// alternatively, check if an item's Id exists and update otherwise create
			fmt.Printf("Upsert %d items\n", len(items))
			return nil
		},
		// Provide a handler to deal with deleted items
		Delete: func(ctx context.Context, ids []int) error {
			// Implement your logic here for your target system:
			// Bulk delete if a target system supports batch processing
			// alternatively, delete item by Id and ignore NotFound error types
			fmt.Printf("Delete %d items\n", len(ids))
			return nil
		},
		// Save the updated entity sync state to a persistent storage
		Persist: func(s *spsync.State) {
			// The state is used to resume sync session from a previous state.
			// Even when a sync session ended up with errors we need to save it.
			// Some scenarious, e.g. sync in serverless jobs might be designed
			// to terminate in 10-15 minutes and resume on a next run.
		},
		// Hook to sync events and use logging middleware of your choice
		Events: &spsync.Events{},
	}

	// Run sync session
	state, err := spsync.Run(ctx, opts)
	if err != nil {
		state.Fails += 1
		// Persist state here as well
		fmt.Println(err)
	}
}

// NewSP constructs SharePoint authenticated API client instance
func NewSP() *api.SP {
	// Simplest strategy for testing
	// for prod the AzureAD auth is recommended:
	// https://go.spflow.com/auth/strategies/azure-certificate-auth

	// Create .env with the following variables: SP_SITE_URL, SP_USERNAME, SP_PASSWORD
	// or export the variables in your shell

	c := cpass.Cpass("")
	password, _ := c.Decode(os.Getenv("SP_PASSWORD"))

	auth := &strategy.AuthCnfg{
		SiteURL:  os.Getenv("SP_SITE_URL"),
		Username: os.Getenv("SP_USERNAME"),
		Password: password,
	}

	client := &gosip.SPClient{AuthCnfg: auth}
	return api.NewSP(client)
}

Powered by gosip.

Documentation

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

This section is empty.

Types

type DeleteHandler

type DeleteHandler func(ctx context.Context, ids []int) error

type Ent

type Ent struct {
	// OData fields to select from a list
	Select []string
	// OData props to expand in a list
	Expand []string
	// Custom .Top() value, default is 1000
	Top int
}

Ent List entity configuration

type Events

type Events struct {
	FullSyncStarted  func(entity string, isBlank bool)
	FullSyncFinished func(entity string, isBlank bool)
	FullSyncRequest  func(entity string, query string)
	IncrSyncStarted  func(entity string)
	IncrSyncFinished func(entity string)
	IncrSyncRequest  func(entity string, startToken string, endToken string)
}

Events hook events during sync session usefull for logging in unopinionated way

type Item

type Item struct {
	// SharePoint List item ID
	ID int
	// Created at timestamp
	Created time.Time
	// Modified at timestamp
	Modified time.Time
	// Item version
	Version int
	// SharePoint List item metadata
	Data map[string]interface{}
}

Item abstract item structure

type Mode

type Mode string

Mode synchronization modes enum

const (
	// Full sync mode
	Full Mode = "Full"
	// Incremental sync mode
	Incr Mode = "Incr"
)

type Options

type Options struct {
	SP      *api.SP
	State   *State
	Ent     *Ent
	Upsert  UpsertHandler
	Delete  DeleteHandler
	Events  *Events
	Persist func(s *State)
}

Options describes sync session and actions

type State

type State struct {
	// SharePoint List URI, e.g. "Lists/ListA"
	EntID string `json:"entId"`
	// Sync mode: Full or Incr
	SyncMode Mode `json:"syncMode"`
	// Last successful sync timestamp
	SyncDate time.Time `json:"syncDate"`
	// Multistaged sync step
	SyncStage string `json:"syncStage"`
	// List change API token
	ChangeToken string `json:"changeToken"`
	// List pagination skip token
	PageToken string `json:"pageToken"`
	// Number of failed in a row sync sessions
	Fails int `json:"fails"`
}

State structure for defining fault tolerant syncronization state

func Run

func Run(ctx context.Context, o *Options) (*State, error)

Run executes sync session per entity

type UpsertHandler

type UpsertHandler func(ctx context.Context, items []Item) error

Directories

Path Synopsis
cmd

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL