Skip to main content

Striim Platform 5.0 documentation

Snowflake Reader

For an introduction to reading from Snowflake, see Snowflake.

Snowflake is a cloud-based data warehousing and analytics platform. It allows users to store, analyze, and query large datasets in a fast and scalable way using SQL-like queries.

Snowflake Reader reads from Snowflake using change data capture, and provides real-time continuous replication after initial load. Snowflake Reader can read data manipulation language (DML)  and limited DDL changes made to tables, such as ALTER TABLE ADD Column and ALTER TABLE DROP Column. Snowflake Reader identifies the changes made to each table by periodically querying the table using a CHANGES clause, with incrementing time intervals for each subsequent poll.

Typical use cases for reading from Snowflake include:

  • Reverse ETL-based use cases (for example, updating predicted LTV or churn scores on customer profiles) to read from Snowflake Analytics results and write to operational systems (such as CRM, SCM, or OLTP databases).

  • Consolidation of data warehouses from departmental instances to a corporate instance.

You can read from Snowflake as follows, and write to any target supported by Striim:

  • For initial load, you can use Database Reader.

  • For real-time continuous replication after initial load using CDC, you can use Snowflake Reader.

  • If you choose not to enable CDC on Snowflake, you can use Incremental Batch Reader to read the new source data at regular intervals, allowing for updates in near real-time.  Incremental Batch Reader differs from the CDC readers in several ways, including not capturing DELETE operations at the source (see the discussion in Incremental Batch Reader about INSERT, UPDATE, and DELETE operations).

Snowflake Reader summary

APIs used/data supported

Snowflake Reader uses the Snowflake JDBC driver snowflake-jdbc-3.17.0.jar, which is bundled with Striim.

Data supported

Tables (Views are not supported)

Supported targets

All targets that support WAEvent input. This includes all database, data warehouse, and file system targets.

Security and authentication

  • Key pair

  • OAuth 2.0

  • Username and password

Operations / modes supported

Data can be read from the previous 24 hours (or longer if Snowflake's Time Travel default retention period is increased).

Captures inserts, updates and deletes.

Schema management

Snowflake Reader does not support initial schema creation at the target. Use Database Reader to create the initial schema.

Snowflake Reader supports schema evolution for selected schema changes. See Handling schema evolution.

Resilience / recovery

Configurable automatic retries.

Supports recovery with at-least once processing. See Recovering applications.

Performance

The SnowflakeConfiguration property group is used to specify the ThreadPoolSize, EnableQuickCatchUp, and YieldAfter, properties that allow you to tune the performance of the Snowflake Reader and the load it places on Snowflake. The following considerations apply to configuring ThreadPoolSize:

  • ThreadPoolSize < Table Count – the average read latency will be relatively higher, the overall load on Snowflake will be lower.

  • ThreadPoolSize = Table Count – Snowflake reader achieves the optimal latency. This configuration can overload source Snowflake when the number of tables I high leading to performance bottlenecks.

  • ThreadPoolSize > Table Count – not recommended, and may result in unused resources in Snowflake Reader.

For more information, see Snowflake Reader runtime considerations.

Programmability

  • Striim TQL

  • Striim Flow Designer

  • Wizards

Metrics and auditing

Key metrics available through Striim monitoring

Key considerations and limitations

  • To query the change data from Snowflake using SnowflakeReader, change tracking must be enabled on the source table.

  • Private Link Support – Striim supports both AWS and Azure Private link for Secure Connectivity. Note: Business Critical or higher edition is required for enabling Private Link in Snowflake.

  • SnowflakeReader does not currently support external tables.

  • Change data availability limitation: Change data is only available for the duration of the Time Travel retention period (default is 1 day). If an application requests change data that is beyond the retention period, it will halt.

Data coverage

Snowflake CDC Reader supports all the Snowflake data types (see Summary of data types in the Snowflake doc) except for the Vector type.