Package sqlserver

v1.0.1-0...-1811ebb
Published: Jun 28, 2023 License: Apache-2.0 Imports: 16 Imported by: 0

README

HarbourBridge: SQLServer-to-Spanner Evaluation and Migration

HarbourBridge is a stand-alone open source tool for Cloud Spanner evaluation, using data from an existing SQL Server database. This README provides details of the tool's SQL Server capabilities. For general HarbourBridge information see this README.

We currently do not support dump file mode for SQL Server. The only way to use HarbourBridge with SQL Server is to connect directly.

We currently do not support clustered columnstore indexes in Spanner; such indexes are therefore skipped during the migration.

Note that either 'sqlserver' or 'mssql' can be used as an identifier with the flag -source in the command line.

Example SQL Server Usage

HarbourBridge can be run directly on a SQL Server database (via Go's database/sql package).

The following examples assume a harbourbridge alias has been set up as described in the Installing HarbourBridge section of the main README.

Directly connecting to a SQL Server database

In this case, HarbourBridge connects directly to the SQL Server database to retrieve table schema and data. Set -source=sqlserver and the corresponding source profile connection parameters host, port, user, dbName, and password.

For example, to perform schema conversion, run

harbourbridge schema -source=sqlserver -source-profile="host=<>,port=<>,user=<>,dbName=<>"

The port and password parameters are optional. port defaults to 1433 for a SQL Server source. If password is omitted, it can be provided at the password prompt.
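Since HarbourBridge connects through Go's database/sql, the source-profile parameters end up in a SQL Server connection string. The sketch below shows one way such a DSN might be assembled, using the common sqlserver:// URL convention; whether HarbourBridge builds it exactly this way internally is an assumption.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildDSN assembles a sqlserver:// connection URL from source-profile
// parameters. The DSN layout follows the widely used go-mssqldb URL
// convention; this is an illustrative sketch, not HarbourBridge's code.
func buildDSN(host string, port int, user, password, dbName string) string {
	u := url.URL{
		Scheme:   "sqlserver",
		User:     url.UserPassword(user, password),
		Host:     fmt.Sprintf("%s:%d", host, port),
		RawQuery: url.Values{"database": {dbName}}.Encode(),
	}
	return u.String()
}

func main() {
	// port defaults to 1433 for SQL Server.
	fmt.Println(buildDSN("localhost", 1433, "sa", "secret", "mydb"))
}
```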

Schema Conversion

| SQL Server Type | Spanner Type |
|---|---|
| INT | INT64 |
| TINYINT | INT64 |
| SMALLINT | INT64 |
| BIGINT | INT64 |
| TIMESTAMP | INT64 |
| ROWVERSION | INT64 |
| BIT | BOOL |
| FLOAT | FLOAT64 |
| REAL | FLOAT64 |
| NUMERIC | NUMERIC |
| DECIMAL | NUMERIC |
| MONEY | NUMERIC |
| SMALLMONEY | NUMERIC |
| CHAR | STRING(1) |
| NCHAR | STRING(N) |
| VARCHAR | STRING(MAX) |
| NVARCHAR | STRING(MAX) |
| TEXT | STRING(MAX) |
| NTEXT | STRING(MAX) |
| DATE | DATE |
| DATETIME | TIMESTAMP |
| DATETIME2 | TIMESTAMP |
| SMALLDATETIME | TIMESTAMP |
| DATETIMEOFFSET | TIMESTAMP |
| TIME | STRING(MAX) |
| BINARY | BYTES |
| VARBINARY | BYTES |
| IMAGE | BYTES |
| XML | STRING(MAX) |
| UNIQUEIDENTIFIER | STRING(MAX) |
| SQL_VARIANT | STRING(MAX) |
| HIERARCHYID | STRING(MAX) |
| Spatial Geography Type | STRING(MAX) |
| Spatial Geometry Types | STRING(MAX) |
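The scalar portion of the mapping table above can be sketched as a simple lookup. This is illustrative only, not HarbourBridge's actual ToSpannerType implementation, which also handles length modifiers (e.g. NCHAR(N) → STRING(N)) and records conversion issues.

```go
package main

import "fmt"

// toSpannerType mirrors the SQL Server-to-Spanner mapping table for a
// subset of scalar types. Length mods are elided for brevity; real
// conversion distinguishes e.g. CHAR/NCHAR sizes.
func toSpannerType(srcType string) string {
	m := map[string]string{
		"INT": "INT64", "TINYINT": "INT64", "SMALLINT": "INT64",
		"BIGINT": "INT64", "TIMESTAMP": "INT64", "ROWVERSION": "INT64",
		"BIT": "BOOL", "FLOAT": "FLOAT64", "REAL": "FLOAT64",
		"NUMERIC": "NUMERIC", "DECIMAL": "NUMERIC",
		"MONEY": "NUMERIC", "SMALLMONEY": "NUMERIC",
		"DATE": "DATE",
		"DATETIME": "TIMESTAMP", "DATETIME2": "TIMESTAMP",
		"SMALLDATETIME": "TIMESTAMP", "DATETIMEOFFSET": "TIMESTAMP",
		"BINARY": "BYTES", "VARBINARY": "BYTES", "IMAGE": "BYTES",
	}
	if t, ok := m[srcType]; ok {
		return t
	}
	// String-like types (TEXT, XML, TIME, UNIQUEIDENTIFIER, ...) fall
	// through to STRING(MAX) in this simplified sketch.
	return "STRING(MAX)"
}

func main() {
	fmt.Println(toSpannerType("SMALLINT"), toSpannerType("XML"))
}
```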
Spatial datatypes

SQL Server supports the SPATIAL GEOGRAPHY and SPATIAL GEOMETRY datatypes; however, Spanner does not support spatial data types. These datatypes are currently mapped to Spanner's standard STRING datatype.

TIMESTAMP

The TIMESTAMP datatype (deprecated in newer versions of SQL Server) was used for row versioning. Hence, it is mapped to INT64 to keep it consistent with the ROWVERSION data type.

Storage Use

The tool maps several SQL Server types to Spanner types that use more storage. For example, SMALLINT is a two-byte integer, but it maps to Spanner's INT64, an eight-byte integer.

Primary Keys

Spanner requires primary keys for all tables. SQL Server recommends the use of primary keys for all tables, but does not enforce this. When converting a table without a primary key:

  • HarbourBridge will check for UNIQUE constraints on the table. If found, it will automatically pick any one of the unique constraints and convert it to a primary key.
  • If no UNIQUE constraints are present, HarbourBridge will create a new primary key column of type INT64. By default, the name of the new column is synth_id.
  • If there is already a column with that name, then a variation is used to avoid collisions.
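The collision-avoidance step for the synthetic key column can be sketched as follows. The exact variation scheme HarbourBridge uses for the fallback names is an assumption here.

```go
package main

import "fmt"

// synthPrimaryKeyName returns a name for the synthetic primary-key column:
// "synth_id" by default, or a numbered variation if a column with that name
// already exists. The numbering scheme is illustrative, not HarbourBridge's
// exact implementation.
func synthPrimaryKeyName(existingCols map[string]bool) string {
	name := "synth_id"
	for i := 0; existingCols[name]; i++ {
		name = fmt.Sprintf("synth_id%d", i)
	}
	return name
}

func main() {
	cols := map[string]bool{"id": true, "synth_id": true}
	fmt.Println(synthPrimaryKeyName(cols))
}
```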
NOT NULL Constraints

The tool preserves NOT NULL constraints. Note that Spanner does not require primary key columns to be NOT NULL. However, in SQL Server, a primary key is a combination of NOT NULL and UNIQUE, and so primary key columns from SQL Server will be mapped to Spanner columns that are both primary keys and NOT NULL.

Foreign Keys

The tool maps SQL Server foreign key constraints into Spanner foreign key constraints, and preserves constraint names where possible. Since Spanner doesn't support DELETE CASCADE and UPDATE CASCADE actions, we drop them.

Default Values

Spanner does not currently support default values. We drop these SQL Server features during conversion.

Secondary Indexes

The tool maps SQL Server non-clustered indexes to Spanner secondary indexes, and preserves constraint names where possible. Note that Spanner requires index key constraint names to be globally unique (within a database), but in SQL Server they only have to be unique within a table, so we add a uniqueness suffix to a name if needed. The tool also maps UNIQUE constraints to UNIQUE secondary indexes.
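The uniqueness-suffix step can be sketched as below; the exact suffix format HarbourBridge uses is an assumption.

```go
package main

import "fmt"

// uniqueIndexName makes a per-table index name globally unique within the
// database by appending a numeric suffix when the name is already taken.
// The "_2", "_3", ... suffix format is illustrative.
func uniqueIndexName(name string, used map[string]bool) string {
	candidate := name
	for i := 2; used[candidate]; i++ {
		candidate = fmt.Sprintf("%s_%d", name, i)
	}
	used[candidate] = true
	return candidate
}

func main() {
	used := map[string]bool{}
	fmt.Println(uniqueIndexName("idx_name", used)) // first use keeps the name
	fmt.Println(uniqueIndexName("idx_name", used)) // second use gets a suffix
}
```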

Other SQL Server features

SQL Server has many other features we haven't discussed, including functions, sequences, procedures, triggers, and views, which are currently not supported in Spanner. The tool does not support these features, and the relevant schema information is ignored during schema conversion.

Data Conversion

Strings, character set support and UTF-8

Spanner requires that STRING values be UTF-8 encoded. All Spanner functions and operators that act on STRING values operate on Unicode characters rather than bytes. Since we map many SQL Server types (including VARCHAR) to Spanner's STRING type, HarbourBridge is effectively a UTF-8 based tool. Internally, we use Go's string type, which supports UTF-8.

Timestamps and Timezones

There are some subtle differences in how timestamps are handled in SQL Server and Spanner.

During data conversion, SQL Server SMALLDATETIME, DATETIME2, and DATETIME values are in ISO 8601 format without offsets, which is consistent with the way Spanner stores timestamps; hence, they are stored in Spanner directly. DATETIMEOFFSET values contain an offset. During conversion, the timezone offset is taken into account and the corresponding UTC time is saved.
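The DATETIMEOFFSET normalization can be sketched with Go's time package. The RFC 3339 input format here is an illustrative assumption about how the offset value arrives as a string; the point is the conversion to UTC before writing to Spanner.

```go
package main

import (
	"fmt"
	"time"
)

// toUTC parses a timestamp carrying an offset and normalizes it to UTC,
// mirroring how a DATETIMEOFFSET value would be stored in Spanner.
// The RFC 3339 layout is an assumption for this sketch.
func toUTC(s string) (string, error) {
	t, err := time.Parse(time.RFC3339, s)
	if err != nil {
		return "", err
	}
	return t.UTC().Format(time.RFC3339), nil
}

func main() {
	out, err := toUTC("2023-06-28T10:30:00+05:30")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // 2023-06-28T05:00:00Z
}
```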

Documentation

Overview

Package sqlserver handles schema and data migrations from SQL Server.

Index

Constants

This section is empty.

Variables

This section is empty.

Functions

func ConvertData

func ConvertData(conv *internal.Conv, tableId string, colIds []string, srcSchema schema.Table, spSchema ddl.CreateTable, vals []string) (string, []string, []interface{}, error)

ConvertData maps the source DB data in vals into Spanner data, based on the Spanner and source DB schemas. Note that since entries in vals may be empty, we also return the list of columns (empty cols are dropped).

func ProcessDataRow

func ProcessDataRow(conv *internal.Conv, tableId string, colIds []string, srcSchema schema.Table, spSchema ddl.CreateTable, vals []string)

ProcessDataRow converts a row of data and writes it out to Spanner. tableId and colIds identify the source table and columns, and vals contains string data to be converted to appropriate types to send to Spanner. ProcessDataRow is only called in DataMode.

Types

type InfoSchemaImpl

type InfoSchemaImpl struct {
	DbName string
	Db     *sql.DB
}

func (InfoSchemaImpl) GetColumns

func (isi InfoSchemaImpl) GetColumns(conv *internal.Conv, table common.SchemaAndName, constraints map[string][]string, primaryKeys []string) (map[string]schema.Column, []string, error)

GetColumns returns a list of Column objects and names.

func (InfoSchemaImpl) GetConstraints

func (isi InfoSchemaImpl) GetConstraints(conv *internal.Conv, table common.SchemaAndName) ([]string, map[string][]string, error)

GetConstraints returns a list of primary keys and by-column map of other constraints. Note: we need to preserve ordinal order of columns in primary key constraints. Note that foreign key constraints are handled in getForeignKeys.

func (InfoSchemaImpl) GetForeignKeys

func (isi InfoSchemaImpl) GetForeignKeys(conv *internal.Conv, table common.SchemaAndName) (foreignKeys []schema.ForeignKey, err error)

GetForeignKeys returns a list of all the foreign key constraints.

func (InfoSchemaImpl) GetIndexes

func (isi InfoSchemaImpl) GetIndexes(conv *internal.Conv, table common.SchemaAndName, colNameIdMap map[string]string) ([]schema.Index, error)

GetIndexes returns a list of all indexes for the specified table.

func (InfoSchemaImpl) GetRowCount

func (isi InfoSchemaImpl) GetRowCount(table common.SchemaAndName) (int64, error)

GetRowCount returns the number of rows in the given table.

func (InfoSchemaImpl) GetRowsFromTable

func (isi InfoSchemaImpl) GetRowsFromTable(conv *internal.Conv, tableId string) (interface{}, error)

GetRowsFromTable returns a sql Rows object for a table.

func (InfoSchemaImpl) GetTableName

func (isi InfoSchemaImpl) GetTableName(schema string, tableName string) string

GetTableName returns table name.

func (InfoSchemaImpl) GetTables

func (isi InfoSchemaImpl) GetTables() ([]common.SchemaAndName, error)

GetTables returns a list of tables in the selected database.

func (InfoSchemaImpl) GetToDdl

func (isi InfoSchemaImpl) GetToDdl() common.ToDdl

GetToDdl helps implement the common.InfoSchema interface.

func (InfoSchemaImpl) ProcessData

func (isi InfoSchemaImpl) ProcessData(conv *internal.Conv, tableId string, srcSchema schema.Table, commonColIds []string, spSchema ddl.CreateTable, additionalAttributes internal.AdditionalDataAttributes) error

ProcessData performs data conversion for the source database. For each table, we extract data using a "SELECT *" query, convert the data to Spanner data (based on the source and Spanner schemas), and write it to Spanner. If we can't get/process data for a table, we skip that table and process the remaining tables.

Note that the database/sql library has a somewhat complex model for returning data from rows.Scan. Scalar values can be returned using the native value used by the underlying driver (by passing *interface{} to rows.Scan), or they can be converted to specific go types. We choose to do all type conversions explicitly ourselves so that we can generate more targeted error messages: hence we pass *interface{} parameters to row.Scan.

func (InfoSchemaImpl) StartChangeDataCapture

func (isi InfoSchemaImpl) StartChangeDataCapture(ctx context.Context, conv *internal.Conv) (map[string]interface{}, error)

We leave the two functions StartChangeDataCapture and StartStreamingMigration empty so that this can be passed as an infoSchema interface. We don't need these for now.

func (InfoSchemaImpl) StartStreamingMigration

func (isi InfoSchemaImpl) StartStreamingMigration(ctx context.Context, client *sp.Client, conv *internal.Conv, streamingInfo map[string]interface{}) error

type ToDdlImpl

type ToDdlImpl struct {
}

ToDdlImpl is the SQL Server-specific implementation of ToDdl.

func (ToDdlImpl) ToSpannerType

func (tdi ToDdlImpl) ToSpannerType(conv *internal.Conv, spType string, srcType schema.Type) (ddl.Type, []internal.SchemaIssue)

ToSpannerType maps a scalar source schema type (defined by id and mods) into a Spanner type. This is the core source-to-Spanner type mapping. ToSpannerType returns the Spanner type and a list of type conversion issues encountered.
