Build a Sankey Diagram (Beta)

Sankey diagrams are currently in Beta and subject to quick, iterative changes. As a result, the latest version may differ from the content of this document.

Sankey diagrams are typically used to assess the flow and change of data between stages in a process or system. With Sigma, you can create basic Sankey diagrams to demonstrate data distribution, workflows, networks, etc. You can also build advanced multi-level charts to analyze more complex data relationships and identify substantial changes in metric values across various stages, categories, or periods.

This document details basic Sankey diagram requirements and introduces key properties and format options to help you enhance your workbook visualizations.

Example use cases:

  • Energy analytics - measure electricity load and consumption to understand facility performance and gain insight into the origins and transformation of energy.
  • Financial analytics - track annual spend by department, division, and expense category to understand the flow of money and analyze budget vs. spend distribution.
  • Marketing analytics - follow website visitor activity by parent domain and subsequent page visits to understand user navigation and assess website architecture deficiencies.

Summary of Content

  • User Requirements
  • Workbook Prerequisite
  • Basic Sankey Diagram Requirements
  • Select the Visualization Type
  • Define the Stages and Categories
  • Define the Metric
  • All Sankey Diagram Format Options
  • Related Resources


User Requirements

To create and save edits to workbook visualizations, you must be the workbook owner or be granted Can Edit access.

Users with Can Explore access to the workbook can modify visualization properties and formatting but cannot save changes. 


Workbook Prerequisite

Before you can build a Sankey diagram, you must add a new visualization element and select a data source.

At the core of every visualization is an underlying data table (derived from the data source) that supplies the information visualized by the chart. As you build a Sankey diagram, Sigma automatically groups, aggregates, and calculates the underlying data to create source columns for various visualization properties. You can view the underlying data table while configuring the chart to see how the data is applied. 

Sankey diagrams support up to 25,000 data points. If the configurations result in a data set that exceeds this limit, the chart displays the first 25,000 data points, and a warning message indicates that the chart is incomplete. To reduce the number of data points, aggregate the values or apply data filters to the visualization or source element.
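
To illustrate why aggregation reduces the number of data points, the following Python sketch collapses raw rows into one total per distinct combination of stage values and compares the result against the limit. It is not Sigma code. The column names and the helper functions are hypothetical, and it assumes one data point per distinct stage combination, which this document does not confirm.

```python
from collections import defaultdict

DATA_POINT_LIMIT = 25_000  # limit stated in this document


def aggregate_rows(rows, stage_columns, value_column):
    """Sum value_column for each distinct combination of stage values."""
    totals = defaultdict(float)
    for row in rows:
        key = tuple(row[col] for col in stage_columns)
        totals[key] += row[value_column]
    return dict(totals)


def exceeds_limit(aggregated, limit=DATA_POINT_LIMIT):
    """Assumption: each aggregated combination counts as one data point."""
    return len(aggregated) > limit


# Hypothetical expense rows; two of them share the same stage combination.
rows = [
    {"department": "Sales", "category": "Travel", "amount": 120.0},
    {"department": "Sales", "category": "Travel", "amount": 80.0},
    {"department": "Sales", "category": "Software", "amount": 300.0},
    {"department": "Support", "category": "Software", "amount": 150.0},
]

aggregated = aggregate_rows(rows, ["department", "category"], "amount")
too_large = exceeds_limit(aggregated)
```

Four input rows become three aggregated combinations. Filtering rows before aggregation reduces the count further.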


Basic Sankey Diagram Requirements

To create a Sankey diagram, you must configure the following properties in the Element properties panel:

  • Visualization - chart type displayed in the workbook
  • Stage - source columns that define the stages and categories
  • Value - source column that defines the data path metric

In a Sankey diagram, stages consist of categories presented as individual rectangular nodes. Data paths connect the categories across stages to illustrate the flow of data. Path values represent a metric (e.g., energy consumption, expenses, page visitors), which measures the quantity of data flowing between categories and determines the width of each path.
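
The relationship between stages, categories, and paths can be sketched as a small data structure. The Python example below is illustrative only and does not reflect how Sigma renders the chart. The function name, the dictionary layout, and the "share" field (used here as a stand-in for relative path width) are all assumptions.

```python
def build_sankey(paths):
    """Build stage nodes and links from {(source, target): value}.

    Each distinct source or target category becomes a node. Each path
    carries its value plus its share of the total, which a renderer
    could use to scale the path width.
    """
    sources = sorted({source for source, _ in paths})
    targets = sorted({target for _, target in paths})
    total = sum(paths.values())
    links = [
        {"source": s, "target": t, "value": v, "share": v / total}
        for (s, t), v in sorted(paths.items())
    ]
    return {"stages": [sources, targets], "links": links}


# Hypothetical energy data: origin of electricity and where it is consumed.
paths = {
    ("Solar", "Lighting"): 30,
    ("Solar", "HVAC"): 20,
    ("Grid", "HVAC"): 50,
}

diagram = build_sankey(paths)
```

In this sketch the first stage has the nodes Grid and Solar, the second stage has HVAC and Lighting, and the Grid to HVAC path is the widest because it carries half of the total value.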

Select the Visualization Type

Once you add a new visualization to a workbook, select the visualization type:

  • In the Visualization property, click the dropdown field and select Sankey from the list.

You can also use this dropdown field to convert an existing visualization to a different type. Sigma retains all property and format configurations shared by the initial and new type. Properties and formatting that the two types do not share are discarded and are not restored if you convert the visualization again.

Define the Stages and Categories

Configure source columns to define the stages and categories.

  1. In the Stage property, click Add column and select an option from the menu:
    • To generate stage categories based on distinct values in an existing column, search or scroll the Select column list and select the preferred column name.
    • To generate stage categories based on a custom formula, select New column and enter the formula in the toolbar.

    You can also select or replace an existing column by dragging and dropping a column name from the Columns list to the Stage property.

  2. [optional] Control how the source column data is categorized and displayed in the chart:
    1. To open the column menu, click the caret to the right of the source column name.
    2. Hover over any of the following items, then select the preferred option:
      • Truncate date - Categorize date values by the selected interval or unit of measure.
      • Transform - Convert the column to the selected data value type.
      • Format - Display data labels in the selected format.

    Availability of column menu items and corresponding options varies depending on the column’s data value type (e.g., Truncate date is available for date values only).

  3. Repeat the previous steps to configure additional stages (a minimum of two stages is required).

    Sigma charts the stages (as start and end points) in the order they are listed in the Stage property, from top to bottom. Drag and drop source column names in the Stage property to reorder them as needed.
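
To see why stage order matters, the Python sketch below derives paths from an ordered list of stage columns. It assumes that paths connect only adjacent stages in the listed order, which is the usual Sankey convention but is not confirmed by this document. The column names and the function are hypothetical.

```python
from collections import defaultdict


def links_between_adjacent_stages(rows, stage_order, value_column):
    """Sum value_column for each pair of categories in adjacent stages."""
    links = defaultdict(float)
    for row in rows:
        # Pair each stage with the one listed directly after it.
        for left, right in zip(stage_order, stage_order[1:]):
            key = ((left, row[left]), (right, row[right]))
            links[key] += row[value_column]
    return dict(links)


# Hypothetical spend rows with three candidate stage columns.
rows = [
    {"division": "East", "department": "Sales", "category": "Travel", "amount": 100.0},
    {"division": "East", "department": "Sales", "category": "Software", "amount": 50.0},
    {"division": "West", "department": "Support", "category": "Software", "amount": 70.0},
]

original = links_between_adjacent_stages(
    rows, ["division", "department", "category"], "amount"
)
reordered = links_between_adjacent_stages(
    rows, ["department", "division", "category"], "amount"
)
```

With the first ordering, division connects to department and department connects to category. After reordering, department connects to division and division connects directly to category, so the same rows produce a different set of paths.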

Define the Metric

Configure a source column to define the metric. Sigma automatically aggregates column values associated with the initial stage categories to measure the data flow starting points. Within each of these categories, Sigma aggregates values associated with the subsequent stage categories, then plots these measures as data paths to the end points.
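
The two levels of aggregation described above can be sketched in Python as follows. This is a simplified illustration with hypothetical column names, not Sigma's implementation, and it assumes a sum as the aggregation method.

```python
from collections import defaultdict


def aggregate_metric(rows, first_stage, next_stage, value_column):
    """Return starting-point totals and the path values within each one."""
    start_totals = defaultdict(float)
    path_values = defaultdict(lambda: defaultdict(float))
    for row in rows:
        start = row[first_stage]
        # Level 1: total for the initial stage category (the starting point).
        start_totals[start] += row[value_column]
        # Level 2: within that category, total per subsequent stage category.
        path_values[start][row[next_stage]] += row[value_column]
    return dict(start_totals), {k: dict(v) for k, v in path_values.items()}


# Hypothetical annual spend rows.
rows = [
    {"department": "Sales", "category": "Travel", "spend": 200.0},
    {"department": "Sales", "category": "Software", "spend": 300.0},
    {"department": "Support", "category": "Software", "spend": 150.0},
]

start_totals, path_values = aggregate_metric(rows, "department", "category", "spend")
```

Because every row contributes to exactly one path, the path values that leave a starting point add up to that starting point's total.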

  1. In the Value property, click Add calculation and select an option from the menu:
    • To aggregate values of an existing column, search or scroll the Aggregate column list and select the preferred column name.
    • To calculate values based on a custom formula, select New column and enter a formula in the toolbar. 
    • To count the number of rows associated with each stage name, select Row count.

    You can also select an existing column by dragging and dropping a column name from the Columns list to the Value property.

  2. [optional] Control how the source column data is calculated and displayed in the chart:
    1. To open the column menu, click the caret to the right of the source column name.
    2. Hover over any of the following items and select the preferred option:
      • Set aggregate - Calculate values based on the selected aggregation method.
      • Transform - Convert the column to the selected data value type.
      • Format - Display data labels in the selected format.

    You can also use the toolbar to change the aggregation method (via the formula) and the data label format.

    If the configuration produces more than 25,000 data points and the chart is incomplete, apply data filters to reduce the number of data points.

  3. [optional] Sigma auto-generates source column names and chart titles to reflect the visualized data, but you can customize these fields as needed:
    • To rename a source column, double-click the column name in the Stage or Value property, then enter a new name. Changes are reflected in the default chart title.
    • To edit the chart title, double-click the title in the visualization, then enter a new title.

    Sigma auto-generates the default chart title only. Once the title is customized, it no longer reflects changes to source columns and their names. To learn more about title customization, see Format Chart Title.

  4. [optional] In the Element Properties > Marks > Color section, select or customize a color palette to apply to the category nodes and paths.
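
The Value options in step 1 and the Set aggregate option in step 2 differ in how the rows behind each path are combined. The Python sketch below compares a sum, an average, and a row count over the same rows. The method labels ("sum", "avg", "row_count") and column names are hypothetical and are not Sigma function names.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical labels mapped to plain Python aggregations.
AGGREGATIONS = {
    "sum": sum,
    "avg": mean,
    "row_count": len,
}


def path_values(rows, source_col, target_col, value_col, method):
    """Group values by (source, target) path, then apply the chosen method."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[(row[source_col], row[target_col])].append(row[value_col])
    aggregate = AGGREGATIONS[method]
    return {path: aggregate(values) for path, values in grouped.items()}


# Hypothetical website visits: parent domain, next page, seconds on page.
rows = [
    {"domain": "shop", "page": "cart", "seconds": 30},
    {"domain": "shop", "page": "cart", "seconds": 50},
    {"domain": "blog", "page": "cart", "seconds": 10},
]

by_sum = path_values(rows, "domain", "page", "seconds", "sum")
by_avg = path_values(rows, "domain", "page", "seconds", "avg")
by_count = path_values(rows, "domain", "page", "seconds", "row_count")
```

The same two paths appear in all three results, but their values, and therefore their relative widths in a chart, change with the aggregation method.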

All Sankey Diagram Format Options


Related Resources