Best practices for optimizing dynamic table performance¶
To optimize your dynamic tables’ performance, you should understand the system, experiment with ideas, and iterate based on results. For example:
Develop ways to improve your data pipeline based on your cost, data lag, and response time needs.
Implement the following actions:
Start with a small, fixed dataset to quickly develop queries.
Test performance with data in motion.
Scale the dataset to verify that it meets your needs.
Adjust your workload based on findings.
Repeat as needed, prioritizing tasks with the greatest performance impact.
Additionally, use downstream lag to manage refresh dependencies between tables efficiently, ensuring that refreshes happen only when necessary.
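For example, an intermediate dynamic table can declare TARGET_LAG = DOWNSTREAM so that it refreshes only when the dynamic tables that read from it need fresh data. The following is a minimal sketch; the table, warehouse, and column names are illustrative:

```sql
-- Intermediate table: refreshes only when downstream consumers need new data.
-- Table, warehouse, and column names are illustrative.
CREATE OR REPLACE DYNAMIC TABLE enriched_orders
  TARGET_LAG = DOWNSTREAM
  WAREHOUSE = my_wh
  AS
    SELECT o.order_id, o.customer_id, c.region
    FROM raw_orders o
    JOIN customers c ON o.customer_id = c.customer_id;

-- Final table: its 10-minute target lag drives refreshes of enriched_orders.
CREATE OR REPLACE DYNAMIC TABLE orders_by_region
  TARGET_LAG = '10 minutes'
  WAREHOUSE = my_wh
  AS
    SELECT region, COUNT(*) AS order_count
    FROM enriched_orders
    GROUP BY region;
```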
Note
When queried, dynamic tables perform similarly to regular Snowflake tables. For more information, see Optimizing performance in Snowflake.
Full refresh performance¶
A full refresh of a dynamic table performs similarly to a CREATE TABLE … AS SELECT statement (also referred to as CTAS) and can be optimized like any other Snowflake query.
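Conceptually, each full refresh re-runs the table’s defining query, much like the CTAS it is compared to above. A sketch with illustrative names:

```sql
-- A dynamic table that always refreshes in full; names are illustrative.
CREATE OR REPLACE DYNAMIC TABLE daily_sales
  TARGET_LAG = '1 hour'
  WAREHOUSE = my_wh
  REFRESH_MODE = FULL
  AS
    SELECT sale_date, SUM(amount) AS total_amount
    FROM sales
    GROUP BY sale_date;

-- Each full refresh does roughly the same work as this CTAS:
CREATE OR REPLACE TABLE daily_sales_ctas AS
  SELECT sale_date, SUM(amount) AS total_amount
  FROM sales
  GROUP BY sale_date;
```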
Incremental refresh performance¶
To help achieve optimal incremental refresh performance for your dynamic tables:
Keep changes between refreshes minimal, ideally less than 5% of the total dataset, for both sources and the dynamic table.
Consider the number of micro-partitions modified, not just the row count: the work an incremental refresh performs is proportional to the total size of the micro-partitions that contain changes, not just the number of changed rows.
Minimize expensive operations such as joins, GROUP BYs, and PARTITION BYs in your query. Break down large Common Table Expressions (CTEs) into smaller parts and create a dynamic table for each, rather than overwhelming a single dynamic table with excessive aggregations or joins (see the first sketch after this list).
Ensure data locality by aligning changes in the source tables with the keys your query uses (for example, join, GROUP BY, and PARTITION BY keys). If your tables aren’t naturally clustered by these keys, consider enabling automatic clustering (see the second sketch after this list).
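For instance, instead of one dynamic table whose query stacks several CTEs with joins and aggregations, you might chain smaller dynamic tables, each performing one step. A sketch with illustrative names:

```sql
-- Step 1: join only. TARGET_LAG = DOWNSTREAM lets the final table drive refreshes.
CREATE OR REPLACE DYNAMIC TABLE orders_joined
  TARGET_LAG = DOWNSTREAM
  WAREHOUSE = my_wh
  AS
    SELECT o.order_id, o.amount, c.region
    FROM orders o
    JOIN customers c ON o.customer_id = c.customer_id;

-- Step 2: aggregation only, reading from the smaller intermediate table.
CREATE OR REPLACE DYNAMIC TABLE region_totals
  TARGET_LAG = '15 minutes'
  WAREHOUSE = my_wh
  AS
    SELECT region, SUM(amount) AS total_amount
    FROM orders_joined
    GROUP BY region;
```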
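And if a large source table isn’t naturally ordered by the key a dynamic table joins or groups on, you can define a clustering key on it so that Automatic Clustering maintains locality. The table and key below are illustrative:

```sql
-- Cluster the source table by the join/grouping key the dynamic table uses.
-- Table and column names are illustrative.
ALTER TABLE orders CLUSTER BY (customer_id);
```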