Materialization (Beta)
This documentation describes a public beta feature and is under construction. This documentation should not be considered part of our published documentation until this notice, and the corresponding Beta flag on the feature in Sigma, are removed. As with any beta feature, the feature discussed below is subject to quick, iterative changes. The latest experience in the Sigma service may differ from the contents of this document.
Beta features are subject to the disclaimer on Beta features.
Materializations improve the speed and performance of your reports. You can configure and schedule materializations both for datasets and workbooks.
Requirements
- Write access must be enabled on your datasetβs connection.
- You must be an organization Admin or be assigned a custom account type that has materialize permissions along with the permissions to edit datasets and/or workbooks.
Introduction
Materializations allows you to write datasets and workbook elements back to your warehouse as tables which can reduce compute costs. Materialization enhances query performance by allowing your data warehouse to avoid recomputing the dataset when it's used by an element or downstream elements.
Materializations are stored in your warehouse and saved in scratch workspace schema automatically managed by Sigma. Sigma's query compiler automatically and transparently uses the latest materialization.
Schedule materialization
A materialization is created through the act of scheduling a materialization. The materialization schedule can effect the data freshness for datasets and all downstream documents. Long running queries that do not display in Sigma can still be materialized.
Create a dataset materialization schedule
-
Open the dataset you would like to materialize.
-
Click the Materialization tab.
-
Click Create Schedule to open the Materialization Schedule modal.
-
Set the schedule for when you would like your dataset to materialize.
-
Click Save
Edit a materialization schedule
-
Open the dataset.
-
Click the Materialization Tab.
-
Click the β’β’β’ icon next to Run Now.
-
Click Edit Schedule to open the Materialization Schedule.
-
Select a new schedule.
-
Click Save.
Run a scheduled materialization
Follow the instructions below to manually run a scheduled materialization.
- Open the dataset.
- Go to your datasetβs Materialization tab.
- Click the Run Now button in the schedule header.
Delete a materialization schedule (and materialized tables)
When a materialization schedule is deleted, Sigma will automatically remove the materialized table from your warehouse. It can take up to 24 hours for the table to be deleted.
- Open the dataset.
- Click the Edit button in the top right corner of your screen.
- Go to your datasetβs Materialization tab.
- Click the β’β’β’ icon next to the Run Now button in the schedule header.
- Click Remove Schedule.
Pause scheduled materializations
By default, Sigma pauses scheduled materializations that have not been used for 60 days.
To change this behavior, follow these steps:
-
Navigate to the Materializations section in the Administration panel.
-
At the top of the Materialization page, change your options to reflect your business requirements:
- Never: Select this option if you never want to stop materializations.
- If materialized table...: Retain this option if you want to pause materializations that are not in use. The default setting is 60 days and you can adjust it as necessary.
Create materializations in workbooks
Β Workbook materialization is similar to datasets, with the following differences:
- It lets you materialize any workbook element (table, visualization, pivot)Β that can be used as a data source for another element.
- Materializations are used when viewing and editing a workbook. Materializations are not used when editing a dataset.
- Workbook materialization offers a smarter refresh when published. Publishing a document only refreshes the materialization if necessary. For example, if the materialized element hasn't changed, materialization doesn't run. Datasets always refresh the materialization on publish.
- It offers support for different levels of reference. Grouping levels can be independently materialized. Datasets only materialize at one grouping level.
- It pushes status changes. The workbook viewer is notified of Materialization activity in a new "alerts" center (to be integrated with in-app notifications) and toasts.
Follow the instructions below to materialize a workbook element.
-
Select the element menu from the top-right corner of the element you want to materialize.
-
Select Schedule materialization.
-
In the Materializations Schedules section, set a schedule to materialize data elements.
A materialization can only have one schedule, but a schedule can have multiple materializations.
-
If an element contains multiple grouping levels, select a grouping level to materialize. To materialize multiple grouping levels, select Add Element and add the element for each additional grouping level.
It's often unnecessary and potentially costly to materialize the most granular level of an element (indicated in the UI as All source columns).
-
Click Save Schedules.
When you publish edits to a workbook, the materialization is initiated.
-
To verify the success of your materialization, you will see a check marks on the elementβs menu.
The lineage view also indicates that a materialization exists for the element and is available for query optimizations.
View the complete materialization list
A complete list of your organization's materialization schedules can be found in your Admin Portal.
-
Go to the Admin Portal.
-
Click Materializations in the left panel to open the Materialization page.
-
You can search Materializations and click the Document name to open the workbook.
In the Status column, you can verify if a materialization was successful.
Last Refresh
View materialization logs and statistics
Logs and statistics for each of your organization's materialization runs can be found in the Usage Dashboard within the Admin Portal.
- Go to the Admin Portal.
- Click Usage in the left panel and then select the Materialization tab.
- From here you can browse statistics about all your organization's materialization runs as well as logs for each run in the table calledΒ Materialization Log.
View materialized tables in your warehouse
To schedule materializations, you must enable write access on your connection. All materialized tables automatically write to the specified schema.
These tables are visible in your cloud data warehouse. However, they're not intended to be used as source tables for other applications. To access this data from other applications, see Sigmaβs Dataset Warehouse Views feature.
Don't make changes to the materialized tables directly in your database. This can result in unexpected results or query failures.
View materialized tables in Sigma
If materialization exists on a dataset, the materialization status will appear on the dataset header information popup. To view these details:
- Click on the information icon in the dataset header. This opens the dataset information popup.
- To view additional details, hover over the materialization status.
Admins can view the status and history of materializations from a datasetβs Materialization tab at any time.
Materialized tables aren't displayed in Sigmaβs connection browser. To use a table for further analysis, build that analysis on top of the associated dataset.
Best practices
Order of materialization and refresh
Imagine that Dataset A is materialized, and Dataset B is built on top of Dataset A.
When you materialize Dataset B, you re-run all of the code in Dataset A as part of this materialization. You do not materialize Dataset B from the materialized Dataset A.
You should always complete refresh Dataset A before refreshing Dataset B. Keep this in mind when creating the overall materialization schedule.
While building datasets with greater complexity over earlier datasets is a better building and conceptualizing approach, it is arguable less efficient than building one very large dataset, from the computational perspective.
Disabled materializations
A datasetβs materialization schedule can become disabled if the dataset or any underlying Datasets change so they no longer support materialization. You may get a "materialization disabled" error.
- Become familiar with materialization limitations.
- To explore upstream sources, use the datasetβs Lineage tab.
- To explore joined sources, use the datasetβs Links tab.
Failing materializations
Materialization can fail for a number of reasons. Admin users can view logs for each materialization run in the Usage page, which showsΒ the explicit error messages.
Check the dataset for new column errors or materialization limitations. Your datasetβs Lineage and Links may also be the source of failed materializations.
When a materialization fails, the dataset continues to use the last successful materialization; see Usage.
Limitations
- Materialization isn't available for datasets that use parameters or system functions. These datasets are expected to return different values when their parameters change. Materialized tables, on the other hand, always return the same value - the fixed output of the dataset at the time the materialization was run. As a result, using the materialized versions of datasets that use parameters or system functions would produce unexpected results.
- Materialization is incompatible with row-level security. The materialization will error if row-level security (user attributes) functions are referenced.
- Datasets referencing other datasets, by means of duplication or joins, can typically be materialized. However, this isn't true if any underlying dataset(s) cannot be materialized.
Updated 5 days ago