Data modeling best practices
A well-defined semantic layer, including metadata and context, can improve the accuracy of results from Sigma Assistant and the relevance of selected data sources and responses.
Follow these best practices for data models that you configure as data sources for Sigma Assistant or that you intend to query with the Sigma MCP server:
- Build semantic layer context with relationships and metrics
- Provide semantic descriptions for relevance
- Provide explicit context
Build semantic layer context
Sigma Assistant uses the context built into your data model to produce more accurate results:
- If you define relationships in your data model, Sigma Assistant can join data to answer complex questions about your data.
- If you create metrics in your data model, Sigma Assistant uses them to perform aggregate calculations. When you set up a metric, do the following:
  - Set a default timeline for metrics to provide context about the relevant time dimension for the metric.
  - Add a description to provide context about what the metric calculates.
Provide semantic descriptions for relevance
Sigma Assistant uses a semantic search index to identify the most relevant source to use to answer a question, choosing from those configured by your admin.
If no metadata is configured for your data model, the search index uses column names and generates descriptions and synonyms to inform source selection. If your data model includes the following metadata, that metadata is also indexed and further helps Sigma Assistant identify the most relevant source:
- Column descriptions
- Table descriptions
- Data model description
In addition to descriptions, you can add a certification badge to indicate the status, quality, and reliability of the data.
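Before you configure a data model as a source, it can help to check which elements still lack descriptions. The following is a minimal sketch that assumes you have an in-memory dictionary describing the model. The dictionary layout (`name`, `description`, `tables`, `columns`) and the function `missing_descriptions` are hypothetical and used only for illustration; they are not Sigma's code representation format or API.

```python
from typing import Any


def missing_descriptions(model: dict[str, Any]) -> list[str]:
    """Return the names of model elements that have no description.

    The dictionary layout is a hypothetical illustration, not Sigma's format.
    """
    missing: list[str] = []

    # Data model description
    if not (model.get("description") or "").strip():
        missing.append(model.get("name", "<model>"))

    for table in model.get("tables", []):
        # Table description
        if not (table.get("description") or "").strip():
            missing.append(table["name"])
        # Column descriptions
        for column in table.get("columns", []):
            if not (column.get("description") or "").strip():
                missing.append(f"{table['name']}.{column['name']}")

    return missing


example_model = {
    "name": "sales",
    "description": "Order-level sales data.",
    "tables": [
        {
            "name": "orders",
            "description": "",
            "columns": [
                {"name": "order_id", "description": "Unique ID, one row per order."},
                {"name": "amount"},
            ],
        }
    ],
}

print(missing_descriptions(example_model))
```

In this example, the `orders` table and the `orders.amount` column are reported because neither has a description, so the search index would need to fall back on names and generated descriptions for them.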
Add descriptions to columns, tables, and data models
Descriptions can be inherited from your data platform, imported from dbt, or set in Sigma:
- Add metadata to a data model, including badges.
- Annotate tables with metadata in the data catalog, including badges.
- Use the Update a data model from a code representation API endpoint to add metadata to a data model.
Write high-quality descriptions
High-quality descriptions are written for both people and machines and provide relevant, specific information like the following:
- The grain of the data
- For a calculated column description, the logic of how the column is calculated
- How to interpret the values in the column
- Assumptions in the data, such as currency type
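One way to keep descriptions consistent is to assemble them from the same four parts listed above. The following is a minimal sketch; the `ColumnDescription` class, its field names, and the sample column are hypothetical and are not part of Sigma.

```python
from dataclasses import dataclass


@dataclass
class ColumnDescription:
    """Hypothetical helper that assembles a description from labeled parts."""

    grain: str
    logic: str = ""
    interpretation: str = ""
    assumptions: str = ""

    def render(self) -> str:
        parts = [
            ("Grain", self.grain),
            ("Logic", self.logic),
            ("Interpretation", self.interpretation),
            ("Assumptions", self.assumptions),
        ]
        # Skip empty parts and end each remaining part with exactly one period.
        return " ".join(
            f"{label}: {text.strip().rstrip('.')}."
            for label, text in parts
            if text.strip()
        )


net_amount = ColumnDescription(
    grain="One row per order line",
    logic="quantity * unit_price - discount",
    interpretation="Negative values indicate refunds",
    assumptions="Amounts are in USD",
)

print(net_amount.render())
```

The rendered text can then be pasted into a column description or supplied through whichever metadata workflow you use. Leaving a part empty omits it from the result.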
Provide explicit context to AI
Descriptions provide information to humans and agents interacting with your data model, while AI context provides instructions to Sigma Assistant about how to use the data in the data model. For more details and examples of what to include in the AI-specific context of a data model, see Manage AI context for data models.
