Database Reader

You can use Database Reader to read all the data that exists in a supported database or data warehouse at the time the Striim application containing Database Reader is started. The most common use case for Database Reader is to load this existing data to the target to establish an initial, point-in-time copy of the source data that can serve as a starting point for subsequent continuous updates or syncs. This process is called "initial load" in Striim, and "historical sync" or "initial snapshot" by others. After completing the initial load using Database Reader, you can use a CDC reader (if available) or Incremental Batch Reader to continuously load updates to the source database or data warehouse to the target.

Database reader supports the following sources:

BigQuery
HP NonStop SQL/MX (and SQL/MP via aliases in SQL/MX)
MariaDB (including Amazon RDS for MariaDB, MariaDB Galera Cluster, MariaDB Xpand, and SkySQL)
MySQL (including Amazon Aurora for MySQL, Amazon RDS for MySQL, Azure Database for MySQL, and Google Cloud SQL for MySQL)
Oracle (including Amazon RDS for Oracle)
PostgreSQL (including Amazon Aurora with PostgreSQL compatibility, Amazon RDS for PostgreSQL, Azure Database for PostgreSQL, Azure Database for PostgreSQL - Flexible Server, Google AlloyDB for PostgreSQL, and Google Cloud SQL for PostgreSQL)
Snowflake
SQL Server (including Amazon RDS for SQL Server and Azure SQL Database)
Sybase
Teradata
YugabyteDB

Note

Sources supported by Database Reader are also supported by Incremental Batch Reader.

If the connection to the database is interrupted, the application will halt.

The drivers for BigQuery, Oracle, PostgreSQL, Snowflake, SQL Server, and YugabyteDB are bundled with Striim Platform. For all other databases, when this adapter is deployed to a Striim server, the appropriate JDBC driver must be installed as described in Installing third-party drivers in Striim Platform.

For all databases, when this adapter is deployed to a Forwarding Agent, the appropriate JDBC driver must be installed as described in Installing third-party drivers in the Forwarding Agent.

If you are using Database Reader to perform an initial load before running an application using a CDC reader, enable CDC on the source and perform the setup tasks for the CDC reader before starting the initial load.

Performing initial loads with Database Reader

Fast Snapshot Recovery during initial load

When a Database Reader application with recovery enabled (see Recovering applications) is stopped, halts, or terminates before the initial load has completed, when you restart the application it will resume reading tables that have not been completely written to the target. Database Reader's Restart Behavior on IL Interruption setting gives you three choices for what the writer will do:

Keep target table data (default): The application will resume loading, from the beginning, the source tables that had not yet been fully written to the target before the application was interrupted, appending the data to any target tables with partial data. This will typically result in duplicate rows in the target or the application halting with a duplicate row error, which will require manual effort (described below) to resolve. This is the default because it is the closest to the behavior of Striim 4.x, in which recovery was not supported for Database Reader.
Replace target table (recommended): Striim will drop and re-create any target tables with partial data or no data, then resume loading, from the beginning, the source tables that had not yet been fully written to the target before the application was interrupted. This option requires Create Schema to be True, and the writer must have DROP permission on tables in the target schema.
Truncate target table: Striim will delete all data from target tables with partial data or no data, then resume loading, from the beginning, the source tables that had not yet been fully written to the target before the application was interrupted. The writer must have TRUNCATE permission on tables in the target schema.
Functionally, Truncate target table is similar to Replace target table except it does not require you to give Striim permission to drop tables in the target.

If you create the Database Reader application using a wizard, to change the Restart Behavior on IL Interruption setting you must modify the application in the Flow Designer.

In detail, depending on the initial load status of each source table, after the application is restarted Striim will do the following or, in the case of Keep target table data, you must do the following. You can view the tables' statuses using the command SHOW <namespace>.<source name> ILSUMMARY;.

table status: description	Keep target table data	Replace target table	Truncate target table
WAITING_TO_PROCESS: Database Reader had not read the table before the application was interrupted.	If Create Schema is True, you must manually drop the table before restarting the application. If Create Schema is False, you must manually truncate the target table before restarting the application.	Striim will drop and re-create the target table.	Striim will attempt to truncate the target table. If the table does not exist, the application will halt.
SCHEMACREATION_COMPLETED: Create Schema is True and the table has been created in the target, but Database Reader had not started reading the table's data before the application was interrupted.	You must manually truncate the target table before restarting the application. (Writing data may have begun even though the status had not yet changed to IL_INPROGRESS.)	Striim will drop and re-create the target table	Striim will truncate the empty target table.
IL_INPROGRESS: Database Reader was reading the table when the application was interrupted.	You must manually truncate the target table before restarting the application.	Striim will drop and re-create the target table	Striim will attempt to truncate the target table. If the table does not exist, the application will halt.
IL_COMPLETED: The table was fully written to the target before the application was interrupted.	No action required.	Database Reader will not read the table again.	Database Reader will not read the table again.

Creating a Database Reader application

Creating a Database Reader application using TQL

The following example creates a cache of data retrieved from a MySQL table:

CREATE TYPE RackType(
  rack_id String KEY,
  datacenter_id String,
  rack_aisle java.lang.Integer,
  rack_row java.lang.Integer,
  slot_count java.lang.Integer
);
CREATE CACHE ConfiguredRacks USING DatabaseReader (
  ConnectionURL:'jdbc:mysql://10.1.10.149/datacenter',
  Username:'username',
  Password:'passwd',
  Query: "SELECT rack_id,datacenter_id,rack_aisle,rack_row,slot_count FROM RackList"
)
QUERY (keytomap:'rack_id') OF RackType;
The following example creates a cache of data retrieved from an Oracle table:
CREATE TYPE CustomerType (
  IPAddress  String KEY,
  RouterId  String,
  ConnectionMode  String,
  CustomerId  String,
  CustomerName  String
);
CREATE CACHE Customers USING DatabaseReader (
  Password: 'password',
  Username: 'striim',
  ConnectionURL: 'jdbc:oracle:thin:@node05.example.com:1521:test5',
  Query: 'SELECT ip_address, router_id, connection_mode, customer_id, customer_name FROM customers',
  FetchSize: 1000
)
QUERY (keytomap:'IPAddress') OF CustomerType;

Database Reader programmer's reference

Database Reader properties

property	type	default value	notes
Connection Profile Name	enum		Appears in Flow Designer only when Database Provider Type is BigQuery and Use Connection Profile is True. See Introducing connection profiles.
Connection URL	String		for BigQuery: `jdbc:googlebigquery:AuthScheme=OAuthJWT;OAuthJWTCert=<path to service account key>;InitiateOAuth=GETANDREFRESH;ProjectId=<BigQuery project ID>` or use a connection profile (see Introducing connection profiles) for HP NonStop SQL/MX: `jdbc:t4sqlmx://<IP address>:<port>` or `jdbc:t4sqlmx://<IP address>:<port>/catalog=<catalog name>;schema=<schema name>` for MariaDB: `jdbc:mariadb://<ip address>:<port>/<database name>`; for additional options, see Failover and High availability with MariaDB Connector/J for MariaDB Galera Cluster: specify the IP address and port for each server in the cluster, separated by commas: `jdbc:mariadb://<IP address>:<port>,<IP address>:<port>,...`; optionally, append `/<database name>`; for additional options, see Failover and High availability with MariaDB Connector/J for MySQL: `jdbc:mysql://<ip address>:<port>` for Oracle: `jdbc:oracle:thin:@<hostname>:<port>:<SID>` or `jdbc:oracle:thin:@<hostname>:<port>/<service name>`. For RAC or PDB environments, use the service name, not the SID. For Oracle Cloud DBaaS, use `jdbc:oracle:thin:@//<public IP address>:1521/<PDB name>.<identitydomain>.oraclecloud.internal`. To use Oracle native network encryption, append `?encryption_client=required&encryption_types_client=` followed by a comma-delimited list of the encryption types to support (3DES112, 3DES168, AES128, AES192, AES256, RC4_128, RC4_256, RC4_40, or RC4_56), for example, `?encryption_client=required&encryption_types_client=AES192,AES256`. If one or more source tables contain LONG or LONG RAW columns, append `?useFetchSizeWithLongColumn=true`. If appending more than one element, begin all but the first with `&` rather than `?`. for PostgreSQL, `jdbc:postgresql://<ip address>:<port>/<database name>` for Snowflake: `jdbc:snowflake://<account identifier>.snowflakecomputing.com?db=<database name>` or use a connection profile (see Introducing connection profiles) for SQL Server: `jdbc:sqlserver://<ip address>:<port>;DatabaseName=<database name>` You may use Active Directory authentication with Azure SQL Database (see Supporting Active Directory authentication for Azure) or, when Striim is running in Windows, with SQL Server (see Supporting Active Directory authentication for SQL Server). for Sybase: `jdbc:jtds:sybase:<ip address>:<port>/<database name>` for Teradata: `jdbc:teradata://<ip address>/DBS_PORT=<port>,DATABASE=<database name>` for YugabyteDB, `jdbc:postgresql://<ip address>:<port>/<database name>`
Create Schema	Boolean	False	If set to True, when Database Reader's output stream is the input stream of an Azure Synapse Writer, BigQuery Writer, Database Writer (for MariaDB, MariaDB Xpand, MySQL, Oracle, PostgreSQL, SQL Server, or YugabyteDB only), Databricks Writer, Snowflake Writer, or Spanner Writer target, the schema(s) and tables specified in the Tables property in the writer will be created in the target. For example, if the Tables property in the writer is `sourceschema1.%,targetschema1.%;sourceschema2.%,targetschema2.%` then the schemas targetschema1 and targetschema2 will be created in the target, and all the tables in sourceschema1 and sourceschema2 will be created in the corresponding target schema. Data types will be mapped as per Data type support & mapping for schema conversion & evolution. Single-column PRIMARY KEY constraints will be replicated in the target. DEFAULT, FOREIGN KEY, NOT NULL, composite PRIMARY KEY, and UNIQUE KEY constraints will not be replicated. If a table already exists in the target, Striim will verify that the source and target structure match. If they do not, the application will halt and log a TargetTableSchemaMismatchException. MON output for the target tables will include a schemaCreationStatus of Pending, InProgress, Success, or Failed.
Database Provider Type	String	Default	Controls which icon appears in the Flow Designer and whether Use Connection Profile appears in the web UI.
Excluded Tables	String		Data for any tables specified here will not be returned. For example, if `Tables` uses a wildcard, data from any tables specified here will be omitted. Multiple table names (separated by semicolons) and wildcards may be used exactly as for `Tables`.
Fetch Size	Integer	100	Sets the maximum number of records to be fetched from the database in a single JDBC method execution (see the discussion of fetchsize in the documentation for your JDBC driver).
JAAS Configuration	String		If you Enable Kerberos authentication, use this property to configure it. To authenticate Oracle using a Kerberos principal, specify `java.security.krb5.conf=/<path>/krb5.conf; doNotPrompt=true; principal=oracleclient@<Kerberos realm>`. For example, `java.security.krb5.conf=/etc/krb5.conf; doNotPrompt=true; principal=oracleclient@MYDOMAIN.COM` To authenticate Oracle using a Kerberos credential cache, specify `java.security.krb5.conf=/<path>/krb5.conf; doNotPrompt=true; useTicketCache=true; ticketCache=<fully qualified credential cache file name`. For example, `java.security.krb5.conf=/etc/krb5.conf; doNotPrompt=true; useTicketCache=true; ticketCache=/home/Jdbc/Security/krbcache/krb5cc'` To authenticate PostgreSQL, specify `java.security.krb5.realm=<Kerberos realm>; java.security.krb5.kdc=<Kerberos server IP address or network name>; javax.security.auth.useSubjectCredsOnly=false; java.security.auth.login.config=/<path>/login.conf`. For example, `java.security.krb5.realm=MYDOMAIN.COM; java.security.krb5.kdc=kdc.mydomain.com; javax.security.auth.useSubjectCredsOnly=false; java.security.auth.login.config=/home/Jdbc/Security/login.conf`. You must also append the appropriate application name from the specified '`login.conf` (see Enable Kerberos authentication) to the connection URL. For example, `jdbc:postgresql://<ip address>:<port>/<database name>&jaasApplicationName=<application name>`.
Password	encrypted password		The password for the specified user. See Encrypted passwords.
Parallel Threads	Integer	1	For Fast Snapshot Loading, specify the number of threads Database Reader will use. This value should not be higher than the number of tables to be read. When Database Reader is started, the tables will be distributed among the threads in round-robin fashion, then each thread will read one table at a time. When the Tables property is a list of tables, the tables are read in that order. If the Tables property uses a wildcard, the sequence is determined by the order in which the JDBC driver gives Database Reader the list of tables. Known issue DEV-49013: the maximum number of tables supported by Parallel Threads is 255.
Query	String		SQL statement specifying the data to return. You may query tables, aliases, synonyms, and views. Query is not supported when Create Schema is True, Parallel Threads is greater than 1, or Restart Behavior on IL Interruption is Truncate_target_table or Replace_target_table. When `Query` is specified and `Tables` is not, the WAEvent TableName metadata field value will be QUERY. When both `Query` and `Tables` are specified, the data specified by `Query` will be returned, and the `Tables` setting will be used only to populate the TableName field. If the query includes a synonym containing a period, it must be enclosed in escaped quotes. For example: `select * from \"synonym.name\"` If using a query when the output of a DatabaseReader source is the input of a DatabaseWriter target, specify the target table name as the value of DatabaseReader's Tables field. The query may not contain single-line quotes (of the format `--comment`). It may contain multi-line comments (of the format `/* comment */`).
Quiesce on IL Completion	Boolean	False	With the default value of False, you must stop the application manually after all data has been read. Set to True to automatically quiesce the application after all data has been read (see discussion of QUIESCE in Console commands). When you see on the Apps page that the application is in the Quiescing state, it means that all the data that existed when the query was submitted has been read and that the target(s) are writing it. When you see that the application is in the Quiesced state, you know that all the data has been written to the target(s). At that point, you can undeploy the initial load application and then start another application for continuous replication of new data.Console commands Note Set to True only if all targets in the application support auto-quiesce (see Writers overview).Writers overview
Restart Behavior on IL Interruption	Enum	Keep_target_table_data	See Fast Snapshot Recovery during initial load.
Return DateTime As	String	Joda	Set to `String` to return timestamp values as strings rather than Joda timestamps. The primary purpose of this option is to avoid losing precision when microsecond timestamps are converted to Joda milliseconds. The format of the string is yyyy-mm-dd hh:mm:ss.ffffff. When the source is BigQuery or Snowflake, this setting is ignored and timestamp values are always returned as Java DataTime.
SSL Config	String		If the source is Oracle and it uses SSL, specify the required SSL properties (see the notes on SSL Config in Oracle Reader properties).
Tables	String		Specify the table(s) or view(s) to be read. MySQL, Oracle, and PostgreSQL names are case-sensitive, SQL Server names are not. Specify names as `<dataset name>.<table name>` for BigQuery, `<database name>.<table name>` for MySQL, `<schema name>.<table name>` for Oracle and PostgresQL, `<schema name>.<table name>` for Snowflake, and `<database name>.<schema name>.<table name>` for SQL Server. You may specify multiple tables and views as a list separated by semicolons or with the `%` wildcard. For example, `HR%` would read all the tables whose names start with HR. You may use the `%` wildcard only for tables, not for schemas or databases. The wildcard is allowed only at the end of the string: for example, `mydb.prefix%` is valid, but `mydb.%suffix` is not. If you are using the Query property, specify QUERY as the table name. Modifying this property can interfere with recovery. If recovery is enabled for the application and it is necessary to modify the tables list, export the application (see Apps page), drop it, update the exported TQL with the new values, and import it (see Apps page).
Use Connection Profile	Boolean	False	See Introducing connection profiles. Appears in web UI only when Database Provider Type is BigQuery or Snowflake.
Username	String		Specify the DBMS user name the adapter will use to log in to the server specified in ConnectionURL. For all databases, this user must have SELECT permission or privileges on the tables specified in the Tables property. For Oracle, this user must also have SELECT privileges on DBA_TAB_COLS and ALL_COLL_TYPES.
Vendor Configuration	Striing		When reading from an Oracle PDB database, specify the PDB container name as follows, replacing `<PDB name>` with the name of the container: {"Database" : { "Oracle" : { "CONTAINER" : "<PDB name>"}}}

The output type is WAevent.

Note

To read from tables in both Oracle CDB and PDB databases, you must create two instances of DatabaseReader, one for each.

Sample Database ReaderWAEvent

For the following row:

id  first_name  last_name  phone  street          city          state  zip_code
1   Deborah     Burks      NULL   9273 Thorne AV  Orchard Park  NY     14127

The WAEvent would be similar to:

WAEvent{
  data: [1,"Deborah","Burks",null,"9273 Thorne AV","Orchard Park","NY","14127"]
  metadata: {"TableName":"BikeStores.sales.customers","ColumnCount":8,
    "OperationName":"SELECT","OPERATION_TS":1681412863364}
  userdata: null
  before: null
  dataPresenceBitMap: "fwM="
  beforePresenceBitMap: "AAA="
  typeUUID: {"uuidstring":"01edda2e-77f7-9b21-83c2-8e859085da65"}
};

The operation name for Database Reader WAEvents is always SELECT.

DatabaseReader data type support and correspondence

The information in these tables also applies to Incremental Batch Reader.

For all sources except BigQuery and Snowflake:

JDBC column type	TQL type	notes
Types.ARRAY	java.lang.String
Types.BIGINT	java.lang.Long
Types.BIT	java.lang.Boolean
Types.CHAR	java.lang.String
Types.DATE	org.joda.time.LocalDate
Types.DECIMAL	java.lang.String
Types.DOUBLE	java.lang.Double
Types.FLOAT	java.lang.Double
Types.INTEGER	java.lang.Integer
Types.NUMERIC	java.lang.String
Types.REAL	java.lang.Float
Types.SMALLINT	java.lang.Short
Types.TIMESTAMP	org.joda.time.DateTime
Types.TINYINT	java.lang.Short	For MySQL, if the source tables contain columns of this type, append `?tinyInt1isBit=false` to the connection URL (`jdbc:mysql://<ip address>:<port>/<database name>?tinyInt1isBit=false`).
Types.VARCHARCHAR	java.lang.String
other types	java.lang.String

DatabaseReader can not read Oracle RAW or LONG RAW columns (Oracle Reader can).

For BigQuery:

BigQuery type	TQL type

ARRAY	java.lang.String
BIGNUMERIC	java.math.BigDecimal
BOOL	java.lang.Boolean
BYTES	java.lang.String
DATE	java.time.LocalDate
DATETIME	java.time.LocalDateTime
FLOAT64	java.lang.Double
GEOGRAPHY	java.lang.String
INT64	java.lang.Long
INTERVAL	java.lang.String
JSON	java.lang.String
NUMERIC	java.math.BigDecimal
RANGE	java.lang.String
STRING	java.lang.String
STRUCT	java.lang.String
TIME	java.lang.String
TIMESTAMP	java.time.ZonedDateTime

For Snowflake:

Snowflake type	Aliases	TQL type
ARRAY	-	java.lang.String
BINARY	VARBINARY	java.lang.String
DATE	-	java.time.LocalDate
FLOAT	DOUBLE, DOUBLE PRECISION, FLOAT4, FLOAT8, REAL	java.lang.Double
GEOGRAPHY	-	java.lang.String
GEOMETRY	-	java.lang.String
NUMBER	BIGINT, BYTEINT, DECIMAL, INT, INTEGER, NUMERIC, SMALLINT, TINYIN	java.math.BigDecimal
OBJECT	-	java.lang.String
TIME	-	java.time.String
TIMESTAMP_LTZ	TIMESTAMP WITH LOCAL TIME ZONE, TIMESTAMPLTZ	java.time.ZonedDateTime
TIMESTAMP_NTZ	DATETIME, TIMESTAMP, TIMESTAMP WITHOUT TIME ZONE, TIMESTAMPNTZ	java.time.LocalDateTime
TIMESTAMP_TZ	TIMESTAMP WITH TIME ZONE, TIMESTAMPTZ	java.time.ZonedDateTime
VARCHAR	CHAR VARYING, NCHAR VARYING, NVARCHAR, NVARCHAR2, STRING, TEXT CHAR, CHARACTER, NCHAR	java.lang.String
VARIANT	-	java.lang.String

Database Reader runtime considerations

Do not perform a REPLACE or ALTER ... RECOMPILE on an application or flow containing a Database Reader when recovery is enabled. If it is necessary to modify the application, export the application (see Apps page), drop it, update the exported TQL with the new values, and import it (see Apps page).

Runtime considerations when using Database Reader and Incremental Batch Reader with BigQuery sources

Known issue DEV-44030: having two or more datasets or tables with the same name but different cases is not supported.
To read partitioned tables that have Require WHERE clause to query data` enabled, include the parameter InsertPartitionFilter=true in the connection URL
By default, fractional seconds will be returned with precision of only three digits (milliseconds). To return fractional seconds with precision of six digits (microseconds), include the parameters Other=KeepRawStringDataTypes=TIME;UseStorageAPI=false in the connection URL. Setting UseStorageAPI=false will change the format of the strings returned for ARRAY types from [value1, value2, value3, ...] to ["value1", "value2", "value3", ...].
By default, empty arrays and nulls are returned as null. To return both empty arrays and nulls as [], include the parameter Other=EmptyArraysAsNull=false in the connection URL. Returning nulls as null and empty arrays as [] is not supported.
If you receive an exception when using the Query property with a valid query, try including the parameter QueryPassthrough=true in the connection URL.

In this section:

Striim Platform 5.0 documentation

Database Reader

Note

Performing initial loads with Database Reader

Fast Snapshot Recovery during initial load

Creating a Database Reader application

Creating a Database Reader application using TQL

Database Reader programmer's reference

Database Reader properties

Note

Note

Sample Database ReaderWAEvent

DatabaseReader data type support and correspondence

Database Reader runtime considerations

Runtime considerations when using Database Reader and Incremental Batch Reader with BigQuery sources

Search results