Skip to main content

Striim Platform 5.0 documentation

Spanner Writer

For a hands-on tutorial, see Continuous data replication to Cloud Spanner using Striim on cloud.google.com.

Note

Spanner Writer supports only the GoogleSQL dialect. It does not support the PostgreSQL dialect.

Spanner Writer does not support named schemas. All tables must be in the default schema.

Spanner Writer properties

property

type

default value

notes

Batch Policy

String

eventCount: 1000, Interval: 60s

The batch policy includes eventcount and interval (see Setting output names and rollover / upload policies for syntax). Events are buffered locally on the Striim server and sent as a batch to the target every time either of the specified values is exceeded. When the app is stopped, all remaining events are sent to the target.

With the default setting, data will be written every 60 seconds or sooner if the buffer accumulates 1000 events.

When Preserve Source Transaction Boundaries is False, Primary key updates are sent as a single-event batch. When Preserve Source Transaction Boundaries is True, a batch with many primary key updates may take some extra time to execute as a read row operation must be performed.

When the batch policy event count or interval is reached, or Spanner's "Mutations per commit" limit (see Spanner > Documentation > Resources> Quotas & limits > Limits for creating, reading, updating, and deleting data) is reached, a transaction is created (see Spanner > Documentation > Guides > Transactions overview.

When Preserve Source Transation Boundary is True, you may disable this property with the setting -1 (see Using Preserve Source Transaction Boundary with Spanner Writer).

CDDL Action

String

Process

See Handling schema evolution.

Checkpoint Table

String

CHKPOINT

To support recovery (see Recovering applications, a checkpoint table must be created in each target database using the following DDL:Recovering applications

CREATE TABLE CHKPOINT (
  ID STRING(MAX) NOT NULL,
  SOURCEPOSITION BYTES(MAX)
) PRIMARY KEY (ID);

If necessary you may use a different table name, in which case change the value of this property. All databases must use the same checkpoint table name.

Excluded Tables

String

If Tables uses a wildcard, data from any tables specified here will be omitted. Multiple table names (separated by semicolons) and wildcards may be used exactly as for Tables.

Ignorable Exception Code

String

By default, if the target DBMS returns an error, Striim terminates the application. Use this property to specify one or more error codes (see Cloud Spanner > Documentation > Reference > Code) to ignore, separated by semicolons, for example, NOT_FOUND;ALREADY_EXISTS. (You may also specify error numbers from legacy documentation.)

Ignored exceptions will be written to the application's exception store (see CREATE EXCEPTIONSTORE).

Instance ID

String

Specify the instance ID for the databases containing the tables to be written to. (Note: the instance ID may not be the same as the instance name.)

Parallel Threads

Integer

See Creating multiple writer instances (parallel threads).

Preserve Source Transaction Boundary

Boolean

False

See Using Preserve Source Transaction Boundary with Spanner Writer.

Private Service Connect Endpoint

String

If using Private Service Connect with Google Spanner, specify the endpoint created in the target Virtual Private Cloud, as described in Private Service Connect support for Google cloud adapters.

Project ID

String

To use a service account key other than the one associated with the Spanner instance's project, specify its project ID here. Otherwise leave blank.

Service Account Key

String

The path (from root or the Striim program directory) and file name to the .json credentials file downloaded from Google (see Service Accounts). This file must be copied to the same location on each Striim server that will run this adapter, or to a network location accessible by all servers. The associated service account must have the Cloud Spanner Database User or higher role for the instance (see Cloud Spanner Roles).

To use a service account key other than the one associated with the Spanner instance's project, specify a value for the Project ID property.

Tables

String

Specify the name(s) of the table(s) to write to, in the format <database>.<table>. When Filter Transaction Boundaries is True, tables must all be in the same database. When Filter Transaction Boundaries is False, they may be in multiple databases.

If the source is Database Reader and its Create Schema property is True, databases corresponding to the specified schemas will be created in Spanner’s default schema and tables matching those in the source schemas will be created in those Spanner databases (see the discussion of Create Schema in Database Reader properties for more details). Otherwise, the table(s) must exist in the target when the application is started, and if a specified target table does not exist, the application will terminate with an error. To skip writes to missing tables without terminating, specify NOT_FOUND as an Ignorable Exception Code.

To minimize latency, create a separate Spanner Writer for each database to be written to rather than writing to multiple databases with a single Spanner Writer.

The target table name(s) specified here must match the case shown in the Spanner UI. See Naming conventions.

When the target's input stream is a user-defined event, specify a single table.

When the input stream of the target is the output of a DatabaseReader, IncrementalBatchReader, or SQL CDC source (that is, when replicating data from one database to another), it can write to multiple tables. In this case, specify the names of both the source and target tables. You may use the % wildcard only for tables, not for schemas or databases. If the reader uses three-part names, you must use them here as well. Note that Oracle CDB/PDB source table names must be specified in two parts when the source is Database Reader or Incremental Batch reader (schema.%,schema.%) but in three parts when the source is Oracle Reader or OJet ((database.schema.%,schema.%). Note that SQL Server source table names must be specified in three parts when the source is Database Reader or Incremental Batch Reader (database.schema.%,schema.%) but in two parts when the source is MS SQL Reader or MS Jet (schema.%,schema.%). Examples:

source.emp,target.emp
source.db1,target.db1;source.db2,target.db2
source.%,target.%
source.mydatabase.emp%,target.mydb.%
source1.%,target1.%;source2.%,target2.%

When a target table has a commit timestamp column, by default its value will be Spanner's current system time when the transaction is committed. To use a different value, use ColumnMap. For example, to use the time the source transaction was committed in Oracle: ORADB1.%,spandb1.% ColumnMap (Ts @metadata(DBCommitTimestamp))

See Mapping columns for additional options.

Using Preserve Source Transaction Boundary with Spanner Writer

When Preserve Source Transaction Boundary has its default value of False, transactions in the Spanner target (see Spanner > Documentation > Guides > Transactions overview) may contain operations from multiple transactions in the source database. Set Preserve Source Transaction Boundary to True to ensure that each transaction in the Spanner target will contain operations from only a single source database transaction or two or more complete source database transactions.

Note

To use Preserve Source Transaction Boundary, Filter Transaction Boundaries must be set to False in the source.

Preserve Source Transaction Boundary may be used when the source is:

  • GG Trail Reader

  • MariaDB Reader

  • MariaDB Xpand Reader

  • MSJet

  • MS SQL Reader with Transaction Support set to True

  • MySQL Reader

  • OJet

  • Oracle Reader

  • PostgreSQL Reader

When Preserve Source Transaction Boundary is True, a single source transaction will result in multiple transactions in the Spanner target in the following circumstances.

  • When the Batch Policy Event Count or Interval is reached, the target transaction is committed and a new target transaction is begun.

  • When a DDL event is received from the source, the target transaction is committed, the DDL is committed as a separate transaction, and a new transaction is begun for the remaining operations from the source transaction.

  • When the target transaction reaches Spanner's "Mutations per commit" limit, the target transaction is committed and a new target transaction is begun. See Spanner > Documentation > Resources> Quotas & limits > Limits for creating, reading, updating, and deleting data for the current limit (as of August 2024 it was 80,000) and "How can I estimate mutation counts?" in Cloud Spanner doubles the number of updates per transaction (note that this older blog post refers to a previous limit of 40,000). Google may allow you to increase this limit; if you do, Contact Striim support to have your Striim environment adjusted accordingly.

Batch Policy interacts with Preserve Source Transaction Boundary as follows:

TQL

results

Batch Policy is disabled.

Sample TQL: BatchPolicy: ‘-1', PreserveSourceTransactionBoundary: 'true'

Target transactions match source transactions (unless mutations per commit limit is exceeded or a DDL event is encountered).

Batch Policy has Event Count and Interval, and no single transaction exceeds the Event Count or Interval.

Sample TQL: BatchPolicy: ‘eventCount: 1000, Interval: 60s', PreserveSourceTransactionBoundary: 'true'

Target transactions contain one or more complete source transactions (unless mutations per commit limit is exceeded or a DDL event is encountered).

Batch Policy has Event Count and Interval, and a single source transaction exceeds one or both.

Sample TQL: BatchPolicy: ‘eventCount: 1000, Interval: 60s', PreserveSourceTransactionBoundary: 'true'

Batch policy is ignored; target transactions match source transactions (unless mutations per commit limit is exceeded or a DDL event is encountered).

Batch Policy has only Event Count.

Sample TQL: BatchPolicy: ‘eventCount: 1000', PreserveSourceTransactionBoundary: 'true'

Not supported.

Batch Policy has only Interval.

Sample TQL: BatchPolicy: ‘Interval: 60s', PreserveSourceTransactionBoundary: 'true'

Not supported.

Spanner Writer sample application

The following sample application will copy all tables from two Oracle source schemas to tables with the same names in two Spanner databases. All source and target tables must exist before the application is started.

CREATE SOURCE OracleSource1 USING OracleReader (
  Username:'myname',
  Password:'******',
  ConnectionURL: 'localhost:1521:XE’,
  Tables:'MYDB1.%;MYDB2.%’
) 
OUTPUT TO sourceStream;

CREATE TARGET SpannerWriterTest USING SpannerWriter(
  Tables:'ORADB1.%,spandb1.%;ORADB2.%,spandb2.%',
  ServiceAccountKey: '<path>/<filename>.json',
  instanceId: 'myinstance'
)
INPUT FROM sourceStream;

Spanner Writer data type support and correspondence

See also Data type support & mapping for schema conversion & evolution.

TQL type

Spanner type

notes

Boolean

BOOL

byte[]

BYTES

DateTime

DATE, TIMESTAMP

Double, Float

FLOAT64, NUMERIC

Integer, Long

INT64

String

STRING

  • maximum permitted length in Spanner is 2,621,440

  • If a String represents a timestamp value, use one of the TO_ZONEDDATETIME functions (see Date functions) to convert it to java.time.ZonedDateTime.

When the input of a SpannerWriter target is the output of an Oracle source (DatabaseReader, IncremenatlBatchReader, or MySQLReader):

Oracle type

Spanner type

notes

BINARY_DOUBLE

FLOAT64, NUMERIC

BINARY

FLOAT64, NUMERIC

BLOB

BYTES

CHAR

STRING

maximum permitted length in Spanner is 2,621,440

CLOB

DATE

DATE

FLOAT

FLOAT64, NUMERIC

LONG

STRING

maximum permitted length in Spanner is 2,621,440

NCHAR

STRING

maximum permitted length in Spanner is 2,621,440

NCLOB

STRING

maximum permitted length in Spanner is 2,621,440

NVARCHAR2

STRING

maximum permitted length in Spanner is 2,621,440

NUMBER

INT64

NUMBER(precision,scale)

FLOAT64, NUMERIC

RAW

BYTES

ROWID

STRING

maximum permitted length in Spanner is 2,621,440

TIMESTAMP

TIMESTAMP

TIMESTAMP

TIMESTAMP WITH LOCAL TIMEZONE

TIMESTAMP

TIMESTAMP

TIMESTAMP WITH TIMEZONE

TIMESTAMP

TIMESTAMP

UROWID

STRING

maximum permitted length in Spanner is 2,621,440

VARCHAR2

STRING

maximum permitted length in Spanner is 2,621,440

XMLTYPE

When the input of a SpannerWriter target is the output of a SQL Server source (DatabaseReader, IncremenatlBatchReader, or MySQLReader):

SQL Server type

Spanner type

bigint

INT64

binary

BYTES

bit

not supported

char

STRING

maximum permitted length in Spanner is 2,621,440

date

DATE

datetime

TIMESTAMP

datetime2

TIMESTAMP

datetimeoffset

TIMESTAMP

decimal

FLOAT64, NUMERIC

float

FLOAT64, NUMERIC

image

BYTES

int

INT64

money

in Striim 4.2.0.4 or later: NUMERIC

nchar

STRING

ntext

STRING

numeric

in Striim 4.2.0.4 or later: NUMERIC

nvarchar

STRING

nvarchar(max)

not supported

real

not supported

smalldatetime

TIMESTAMP

smallint

INT64

smallmoney

in Striim 4.2.0.4 or later: NUMERIC

text

STRING

maximum permitted length in Spanner is 2,621,440

time

STRING

tinyint

INT64

uniqueidentifier

not supported

varbinary

BYTES

varchar

STRING

maximum permitted length in Spanner is 2,621,440

xml

STRING

maximum permitted length in Spanner is 2,621,440