Package sqlserver

v1.0.1-0...-1811ebb
Published: Jun 28, 2023 License: Apache-2.0 Imports: 16 Imported by: 0

README

HarbourBridge: SQLServer-to-Spanner Evaluation and Migration

HarbourBridge is a stand-alone open source tool for Cloud Spanner evaluation, using data from an existing SQL Server database. This README provides details of the tool's SQL Server capabilities. For general HarbourBridge information see this README.

We currently do not support dump file mode for SQL Server. The only way to use HarbourBridge with SQL Server is to connect directly.

We currently do not support clustered columnstore indexes in Spanner; such indexes are therefore skipped during the migration.

Note that either 'sqlserver' or 'mssql' can be used as an identifier with the flag -source in the command line.

Example SQL Server Usage

HarbourBridge can be run directly on a SQL Server database (via Go's database/sql package).

The following examples assume a harbourbridge alias has been set up as described in the Installing HarbourBridge section of the main README.

Directly connecting to a SQL Server database

In this case, HarbourBridge connects directly to the SQL Server database to retrieve table schema and data. Set -source=sqlserver and the corresponding source profile connection parameters host, port, user, dbName, and password.

For example, to perform schema conversion, run

harbourbridge schema -source=sqlserver -source-profile="host=<>,port=<>,user=<>,dbName=<>"

The port and password parameters are optional. port defaults to 1433 for a SQL Server source. If password is omitted, it can be provided at the password prompt.
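Since HarbourBridge connects through Go's database/sql, the source-profile parameters end up in a SQL Server connection string. The sketch below shows one way such a DSN might be assembled, using the common sqlserver:// URL convention; whether HarbourBridge builds it exactly this way internally is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildDSN assembles a sqlserver:// connection URL from source-profile
// parameters. The DSN layout follows the widely used go-mssqldb URL
// convention; this is an illustrative sketch, not HarbourBridge's code.
func buildDSN(host string, port int, user, password, dbName string) string {
	u := url.URL{
		Scheme:   "sqlserver",
		User:     url.UserPassword(user, password),
		Host:     fmt.Sprintf("%s:%d", host, port),
		RawQuery: url.Values{"database": {dbName}}.Encode(),
	}
	return u.String()
}

func main() {
	// port defaults to 1433 for SQL Server.
	fmt.Println(buildDSN("localhost", 1433, "sa", "secret", "mydb"))
}
```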

Schema Conversion

| SQL Server Type | Spanner Type |
|---|---|
| INT | INT64 |
| TINYINT | INT64 |
| SMALLINT | INT64 |
| BIGINT | INT64 |
| TIMESTAMP | INT64 |
| ROWVERSION | INT64 |
| BIT | BOOL |
| FLOAT | FLOAT64 |
| REAL | FLOAT64 |
| NUMERIC | NUMERIC |
| DECIMAL | NUMERIC |
| MONEY | NUMERIC |
| SMALLMONEY | NUMERIC |
| CHAR | STRING(1) |
| NCHAR | STRING(N) |
| VARCHAR | STRING(MAX) |
| NVARCHAR | STRING(MAX) |
| TEXT | STRING(MAX) |
| NTEXT | STRING(MAX) |
| DATE | DATE |
| DATETIME | TIMESTAMP |
| DATETIME2 | TIMESTAMP |
| SMALLDATETIME | TIMESTAMP |
| DATETIMEOFFSET | TIMESTAMP |
| TIME | STRING(MAX) |
| BINARY | BYTES |
| VARBINARY | BYTES |
| IMAGE | BYTES |
| XML | STRING(MAX) |
| UNIQUEIDENTIFIER | STRING(MAX) |
| SQL_VARIANT | STRING(MAX) |
| HIERARCHYID | STRING(MAX) |
| Spatial Geography Type | STRING(MAX) |
| Spatial Geometry Types | STRING(MAX) |
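The scalar portion of the mapping table above can be sketched as a simple lookup. This is illustrative only, not HarbourBridge's actual ToSpannerType implementation, which also handles length modifiers (e.g. NCHAR(N) → STRING(N)) and records conversion issues.

```go
package main

import "fmt"

// toSpannerType mirrors the SQL Server-to-Spanner mapping table for a
// subset of scalar types. Length mods are elided for brevity; real
// conversion distinguishes e.g. CHAR/NCHAR sizes.
func toSpannerType(srcType string) string {
	m := map[string]string{
		"INT": "INT64", "TINYINT": "INT64", "SMALLINT": "INT64",
		"BIGINT": "INT64", "TIMESTAMP": "INT64", "ROWVERSION": "INT64",
		"BIT": "BOOL", "FLOAT": "FLOAT64", "REAL": "FLOAT64",
		"NUMERIC": "NUMERIC", "DECIMAL": "NUMERIC",
		"MONEY": "NUMERIC", "SMALLMONEY": "NUMERIC",
		"DATE": "DATE",
		"DATETIME": "TIMESTAMP", "DATETIME2": "TIMESTAMP",
		"SMALLDATETIME": "TIMESTAMP", "DATETIMEOFFSET": "TIMESTAMP",
		"BINARY": "BYTES", "VARBINARY": "BYTES", "IMAGE": "BYTES",
	}
	if t, ok := m[srcType]; ok {
		return t
	}
	// String-like types (TEXT, XML, TIME, UNIQUEIDENTIFIER, ...) fall
	// through to STRING(MAX) in this simplified sketch.
	return "STRING(MAX)"
}

func main() {
	fmt.Println(toSpannerType("SMALLINT"), toSpannerType("XML"))
}
```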
Spatial datatypes

SQL Server supports the SPATIAL GEOGRAPHY and SPATIAL GEOMETRY datatypes; however, Spanner does not support spatial data types. These datatypes are currently mapped to Spanner's standard STRING datatype.

TIMESTAMP

The TIMESTAMP datatype (deprecated in newer versions of SQL Server) was used for row versioning. Hence, it is mapped to INT64 to keep it consistent with the ROWVERSION data type.

Storage Use

The tool maps several SQL Server types to Spanner types that use more storage. For example, SMALLINT is a two-byte integer, but it maps to Spanner's INT64, an eight-byte integer.

Primary Keys

Spanner requires primary keys for all tables. SQL Server recommends the use of primary keys for all tables, but does not enforce this. When converting a table without a primary key:

  • HarbourBridge will check for UNIQUE constraints on the table. If found, it will automatically pick any one of the unique constraints and convert it to a primary key.
  • If no UNIQUE constraints are present, HarbourBridge will create a new primary key column of type INT64. By default, the name of the new column is synth_id.
  • If there is already a column with that name, then a variation is used to avoid collisions.
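The collision-avoidance step for the synthetic key column can be sketched as follows. The exact variation scheme HarbourBridge uses for the fallback names is an assumption here.

```go
package main

import "fmt"

// synthPrimaryKeyName returns a name for the synthetic primary-key column:
// "synth_id" by default, or a numbered variation if a column with that name
// already exists. The numbering scheme is illustrative, not HarbourBridge's
// exact implementation.
func synthPrimaryKeyName(existingCols map[string]bool) string {
	name := "synth_id"
	for i := 0; existingCols[name]; i++ {
		name = fmt.Sprintf("synth_id%d", i)
	}
	return name
}

func main() {
	cols := map[string]bool{"id": true, "synth_id": true}
	fmt.Println(synthPrimaryKeyName(cols))
}
```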
NOT NULL Constraints

The tool preserves NOT NULL constraints. Note that Spanner does not require primary key columns to be NOT NULL. However, in SQL Server, a primary key is a combination of NOT NULL and UNIQUE, and so primary key columns from SQL Server will be mapped to Spanner columns that are both primary keys and NOT NULL.

Foreign Keys

The tool maps SQL Server foreign key constraints into Spanner foreign key constraints, and preserves constraint names where possible. Since Spanner doesn't support DELETE CASCADE and UPDATE CASCADE actions, we drop them.

Default Values

Spanner does not currently support default values. We drop these SQL Server features during conversion.

Secondary Indexes

The tool maps SQL Server non-clustered indexes to Spanner secondary indexes, and preserves constraint names where possible. Note that Spanner requires index key constraint names to be globally unique (within a database), but in SQL Server they only have to be unique within a table, so we add a uniqueness suffix to a name if needed. The tool also maps UNIQUE constraints to UNIQUE secondary indexes.
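The uniqueness-suffix step can be sketched as below; the exact suffix format HarbourBridge uses is an assumption.

```go
package main

import "fmt"

// uniqueIndexName makes a per-table index name globally unique within the
// database by appending a numeric suffix when the name is already taken.
// The "_2", "_3", ... suffix format is illustrative.
func uniqueIndexName(name string, used map[string]bool) string {
	candidate := name
	for i := 2; used[candidate]; i++ {
		candidate = fmt.Sprintf("%s_%d", name, i)
	}
	used[candidate] = true
	return candidate
}

func main() {
	used := map[string]bool{}
	fmt.Println(uniqueIndexName("idx_name", used)) // first use keeps the name
	fmt.Println(uniqueIndexName("idx_name", used)) // second use gets a suffix
}
```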

Other SQL Server features

SQL Server has many other features we haven't discussed, including functions, sequences, procedures, triggers, and views, which are currently not supported in Spanner. The tool does not support these features, and the relevant schema information is ignored during schema conversion.

Data Conversion

Strings, character set support and UTF-8

Spanner requires that STRING values be UTF-8 encoded. All Spanner functions and operators that act on STRING values operate on Unicode characters rather than bytes. Since we map many SQL Server types (including VARCHAR) to Spanner's STRING type, HarbourBridge is effectively a UTF-8 based tool. Internally, we use Go's string type, which supports UTF-8.

Timestamps and Timezones

There are some subtle differences in how timestamps are handled in SQL Server and Spanner.

During data conversion, SQL Server SMALLDATETIME, DATETIME2, and DATETIME values are in ISO 8601 format without offsets, which is consistent with the way Spanner stores timestamps; hence, they are stored in Spanner directly. DATETIMEOFFSET values contain an offset. During conversion, the timezone offset is taken into account and the corresponding UTC time is saved.
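The DATETIMEOFFSET normalization can be sketched with Go's time package. The RFC 3339 input format here is an illustrative assumption about how the offset value arrives as a string; the point is the conversion to UTC before writing to Spanner.

```go
package main

import (
	"fmt"
	"time"
)

// toUTC parses a timestamp carrying an offset and normalizes it to UTC,
// mirroring how a DATETIMEOFFSET value would be stored in Spanner.
// The RFC 3339 layout is an assumption for this sketch.
func toUTC(s string) (string, error) {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		return "", err
	}
	return t.UTC().Format(time.RFC3339), nil
}

func main() {
	out, err := toUTC("2023-06-28T10:30:00+05:30")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // 2023-06-28T05:00:00Z
}
```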

Documentation

Overview

Package sqlserver handles schema and data migrations from SQL Server.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func ConvertData

func ConvertData(conv *internal.Conv, tableId string, colIds []string, srcSchema schema.Table, spSchema ddl.CreateTable, vals []string) (string, []string, []interface{}, error)

ConvertData maps the source DB data in vals into Spanner data, based on the Spanner and source DB schemas. Note that since entries in vals may be empty, we also return the list of columns (empty cols are dropped).

func ProcessDataRow

func ProcessDataRow(conv *internal.Conv, tableId string, colIds []string, srcSchema schema.Table, spSchema ddl.CreateTable, vals []string)

ProcessDataRow converts a row of data and writes it out to Spanner. tableId and colIds identify the source table and columns, and vals contains string data to be converted to appropriate types to send to Spanner. ProcessDataRow is only called in DataMode.

Types

type InfoSchemaImpl

type InfoSchemaImpl struct {
	DbName string
	Db     *sql.DB
}

func (InfoSchemaImpl) GetColumns

func (isi InfoSchemaImpl) GetColumns(conv *internal.Conv, table common.SchemaAndName, constraints map[string][]string, primaryKeys []string) (map[string]schema.Column, []string, error)

GetColumns returns a list of Column objects and names.

func (InfoSchemaImpl) GetConstraints

func (isi InfoSchemaImpl) GetConstraints(conv *internal.Conv, table common.SchemaAndName) ([]string, map[string][]string, error)

GetConstraints returns a list of primary keys and by-column map of other constraints. Note: we need to preserve ordinal order of columns in primary key constraints. Note that foreign key constraints are handled in getForeignKeys.

func (InfoSchemaImpl) GetForeignKeys

func (isi InfoSchemaImpl) GetForeignKeys(conv *internal.Conv, table common.SchemaAndName) (foreignKeys []schema.ForeignKey, err error)

GetForeignKeys returns a list of all the foreign key constraints.

func (InfoSchemaImpl) GetIndexes

func (isi InfoSchemaImpl) GetIndexes(conv *internal.Conv, table common.SchemaAndName, colNameIdMap map[string]string) ([]schema.Index, error)

GetIndexes returns a list of all indexes for the specified table.

func (InfoSchemaImpl) GetRowCount

func (isi InfoSchemaImpl) GetRowCount(table common.SchemaAndName) (int64, error)

GetRowCount returns the number of rows in the given table.

func (InfoSchemaImpl) GetRowsFromTable

func (isi InfoSchemaImpl) GetRowsFromTable(conv *internal.Conv, tableId string) (interface{}, error)

GetRowsFromTable returns a sql Rows object for a table.

func (InfoSchemaImpl) GetTableName

func (isi InfoSchemaImpl) GetTableName(schema string, tableName string) string

GetTableName returns table name.

func (InfoSchemaImpl) GetTables

func (isi InfoSchemaImpl) GetTables() ([]common.SchemaAndName, error)

GetTables returns a list of tables in the selected database.

func (InfoSchemaImpl) GetToDdl

func (isi InfoSchemaImpl) GetToDdl() common.ToDdl

GetToDdl helps implement the common.InfoSchema interface.

func (InfoSchemaImpl) ProcessData

func (isi InfoSchemaImpl) ProcessData(conv *internal.Conv, tableId string, srcSchema schema.Table, commonColIds []string, spSchema ddl.CreateTable, additionalAttributes internal.AdditionalDataAttributes) error

ProcessData performs data conversion for the source database. For each table, we extract data using a "SELECT *" query, convert the data to Spanner data (based on the source and Spanner schemas), and write it to Spanner. If we can't get/process data for a table, we skip that table and process the remaining tables.

Note that the database/sql library has a somewhat complex model for returning data from rows.Scan. Scalar values can be returned using the native value used by the underlying driver (by passing *interface{} to rows.Scan), or they can be converted to specific go types. We choose to do all type conversions explicitly ourselves so that we can generate more targeted error messages: hence we pass *interface{} parameters to row.Scan.

func (InfoSchemaImpl) StartChangeDataCapture

func (isi InfoSchemaImpl) StartChangeDataCapture(ctx context.Context, conv *internal.Conv) (map[string]interface{}, error)

We leave the two functions StartChangeDataCapture and StartStreamingMigration empty so that this can be passed as an infoSchema interface. We don't need these for now.

func (InfoSchemaImpl) StartStreamingMigration

func (isi InfoSchemaImpl) StartStreamingMigration(ctx context.Context, client *sp.Client, conv *internal.Conv, streamingInfo map[string]interface{}) error

type ToDdlImpl

type ToDdlImpl struct {
}

ToDdlImpl is the SQL Server-specific implementation of ToDdl.

func (ToDdlImpl) ToSpannerType

func (tdi ToDdlImpl) ToSpannerType(conv *internal.Conv, spType string, srcType schema.Type) (ddl.Type, []internal.SchemaIssue)

ToSpannerType maps a scalar source schema type (defined by id and mods) into a Spanner type. This is the core source-to-Spanner type mapping. ToSpannerType returns the Spanner type and a list of type conversion issues encountered.
