Connect to Databricks

Sigma supports secure connections to Databricks.

This document explains how to connect your organization to a Databricks warehouse.

📘

For information about Sigma feature compatibility with Databricks connections, see Region, warehouse, and feature support.

Requirements

In your Sigma organization:

  • You must be assigned the Admin account type or an account type with the Manage connections feature permission enabled.

In Databricks:

  • You must have access to a Databricks workspace with Databricks SQL access enabled. See Manage entitlements in the Databricks documentation.
  • You must have access to a Databricks SQL warehouse or have the ability to create one by either being an Admin or having the Allow unrestricted cluster creation user entitlement. See Requirements in the Databricks documentation.
  • You must be able to use either your own personal access token (PAT) or one attached to a user or service principal that has the required permissions. See Monitor and manage access to personal access tokens in the Databricks documentation.

Configure Databricks

Complete the following steps in Databricks before you add a Databricks connection to Sigma.

  1. Create a Databricks SQL warehouse if one doesn't already exist. See Create a SQL warehouse in the Databricks documentation.

  2. Confirm that the user or service principal you plan to use to connect to this SQL warehouse has Can use or Can manage permissions for the compute resource, and that all workspace users have Can use permissions.

  3. Configure your Auto stop setting. For information on this setting, see Configure SQL warehouse settings in the Databricks documentation.

    • If you are running a Serverless SQL warehouse, Sigma recommends that you enable Auto stop and set it to 10 or 15 minutes.
    • If you are running a Pro or Classic SQL warehouse, disable Auto stop on your SQL warehouse so that your first query does not time out or run slowly because the warehouse is suspended.
  4. Configure data access to the SQL warehouse. In order to query data using the Databricks SQL warehouse, the user, group, or service principal that you use to connect Databricks to Sigma needs underlying access to the data. For instructions on how to set these permissions in Unity Catalog, see Manage privileges in Unity Catalog in the Databricks documentation.

    • At the catalog level, grant all account users USE CATALOG and USE SCHEMA privileges.

    • At the schema level, grant all account users BROWSE, EXECUTE, READ VOLUME, and SELECT privileges.

    • If you plan to enable write-access features on this connection, also grant all account users MODIFY and CREATE TABLE privileges at the schema level on the dedicated catalogs and schemas you plan to define for write access.

      For details on the privileges required for write access, see Unity Catalog privileges and securable objects in the Databricks documentation.

    📘

    If you are using the legacy Hive metastore to manage permissions, the permissions model is different. To set up equivalent privileges with the legacy Hive metastore, see Hive metastore privileges and securable objects (legacy) in the Databricks documentation. If you want to sync data from your hive_metastore catalog, the tables in that catalog require READ_METADATA privileges.

  5. Obtain the Server hostname and HTTP path from your SQL warehouse’s Connection details screen. You need these values in the next step when you configure the Databricks connection in Sigma.

  6. Create an access token for the user or service principal to use to connect to this SQL warehouse. The type of token you create depends on the authentication method you use when configuring the Databricks connection in Sigma. For token creation instructions, see Authentication for Databricks tools and APIs in the Databricks documentation.
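The read privileges described in step 4 can be written as Unity Catalog GRANT statements. The following Python sketch only builds the SQL text for review; it does not run anything against Databricks. The catalog name `main`, the schema name `sales`, and the helper names are hypothetical placeholders, and `account users` is assumed to be the group that represents all account users.

```python
def quote(identifier: str) -> str:
    """Wrap an identifier in backticks, escaping any embedded backtick."""
    return "`" + identifier.replace("`", "``") + "`"


def read_grants(catalog: str, schema: str, principal: str = "account users") -> list[str]:
    """Build GRANT statements for the catalog-level and schema-level read privileges."""
    target_schema = f"{quote(catalog)}.{quote(schema)}"
    return [
        # Catalog level: USE CATALOG and USE SCHEMA.
        f"GRANT USE CATALOG, USE SCHEMA ON CATALOG {quote(catalog)} TO {quote(principal)};",
        # Schema level: BROWSE, EXECUTE, READ VOLUME, and SELECT.
        f"GRANT BROWSE, EXECUTE, READ VOLUME, SELECT ON SCHEMA {target_schema} TO {quote(principal)};",
    ]


# Hypothetical catalog and schema names; replace them with your own.
for statement in read_grants("main", "sales"):
    print(statement)
```

Review the printed statements, then run them in a Databricks SQL editor as a user who is allowed to grant privileges on those objects.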

Considerations when connecting Sigma to Databricks

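When you connect Sigma to Databricks, choose the authentication method that best fits your use case: OAuth or Basic Auth with an access token. Both options are described in Specify your connection credentials.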

Create a Databricks connection in Sigma

To create a Databricks connection, perform the following steps in Sigma:

Add a connection and specify connection details

  1. Click the user icon at the top right of your screen. The user icon is usually composed of your initials.

  2. In the drop-down menu, select Add connection. The Add new connection page appears.

  3. In the Connection details section, specify the following:

    Name: Enter a name for the new connection. Sigma displays this name in the connection list.
    Type: Select Databricks.

Specify your connection credentials

In the Connection credentials section, fill out the required fields:

  1. In the Host field, enter the value of the Server hostname field in the Connection details screen of your SQL warehouse.

  2. In the HTTP path field, enter the value of the HTTP path field in the Connection details screen of your SQL warehouse.

  3. Click the down arrow next to Authentication, then choose your authentication method.

    • If you want to authenticate your connection with OAuth, select OAuth.
    • Otherwise, select Basic Auth, then generate a token in Databricks to authenticate the Sigma connection. For instructions, see Databricks personal access tokens for service principals in the Databricks documentation.
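Before you create the connection, it can help to confirm that the Server hostname, HTTP path, and token work together. The sketch below is an optional check, not part of Sigma's setup. It assumes the open-source `databricks-sql-connector` Python package is installed; the helper names `clean_host` and `check_connection` are hypothetical.

```python
def clean_host(server_hostname: str) -> str:
    """Strip a pasted URL scheme and trailing slash, leaving only the hostname."""
    host = server_hostname.strip()
    for scheme in ("https://", "http://"):
        if host.lower().startswith(scheme):
            host = host[len(scheme):]
    return host.rstrip("/")


def check_connection(server_hostname: str, http_path: str, access_token: str) -> bool:
    """Run SELECT 1 against the SQL warehouse and report whether it returned 1."""
    # Imported here so clean_host() works even without the connector installed.
    from databricks import sql

    with sql.connect(
        server_hostname=clean_host(server_hostname),
        http_path=http_path.strip(),
        access_token=access_token,
    ) as connection:
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")
            return cursor.fetchall()[0][0] == 1
```

To use it, call `check_connection` with the Server hostname and HTTP path copied from the Connection details screen and a token read from a secure location such as an environment variable. If the call raises an error, recheck the values and the token's permissions before entering them in Sigma.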

Next, see Configure OAuth features if you are using OAuth to authenticate your connection. If you are not using OAuth, skip to Configure write access and Configure connection features for additional options. Or, if you are finished configuring your connection, click Create at the top right to create your connection.

Configure OAuth features

If you selected OAuth as your authentication method for the connection, see Connect to Databricks with OAuth for the complete steps.

Configure write access

Write access is necessary for features that write data back to Databricks, including input tables, warehouse views, materializations, and CSV uploads.

The steps to configure write access differ depending on whether you are using OAuth or basic authentication for the connection. Follow the instructions in the section that matches your authentication option.

📘

When you designate a schema as the write access destination, Sigma reserves it for internal write-back objects and doesn’t expose it as a data source in the connection explorer (data catalog). To avoid restricting user access to analytical data, use a dedicated write-back database or schema that doesn’t store data used for analysis and reports.

Configure write access on a connection with basic authentication

Configuring write access requires you to set up dedicated catalogs and schemas in Databricks that Sigma can use to write data, and to grant MODIFY and CREATE TABLE privileges on those schemas to the user or service principal that the connection uses.

  1. Turn on the Enable write access toggle.

  2. Configure the following fields:

    1. In the Write catalog field, enter the name of the catalog where Sigma must store write-back data.
    2. In the Write schema field, enter the schema where Sigma must store write-back data.
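The Databricks-side preparation for these two fields can be sketched as SQL generated from Python. This is an illustration under assumptions: the catalog `sigma_writeback`, the schema `sigma_wb`, and the principal name are hypothetical, the catalog is assumed to exist already, and the sketch only prints statements for review.

```python
def write_access_sql(catalog: str, schema: str, principal: str) -> list[str]:
    """Build statements that create a dedicated write schema and grant write privileges on it."""
    target = f"`{catalog}`.`{schema}`"
    return [
        # Create the dedicated schema that Sigma writes to, if it is missing.
        f"CREATE SCHEMA IF NOT EXISTS {target};",
        # Grant the write privileges named in this section to the connecting principal.
        f"GRANT MODIFY, CREATE TABLE ON SCHEMA {target} TO `{principal}`;",
    ]


# Hypothetical names; replace them with your own catalog, schema, and principal.
for statement in write_access_sql("sigma_writeback", "sigma_wb", "sigma-service-principal"):
    print(statement)
```

The catalog and schema names passed to the function are the same values you enter in the Write catalog and Write schema fields.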

Configure write access on a connection with OAuth

Configuring write access requires setting up dedicated catalogs and schemas in Databricks and granting the necessary permissions on them. For information about how write access works for OAuth connections, see About OAuth with write access.

  1. Turn on the Enable write access toggle.
  2. For Write destinations, provide at least one path in the format CATALOG.SCHEMA where Databricks must store write-back data from Sigma objects, including input tables, input table edit logs, warehouse views, materializations, CSV uploads, and usage data from Sigma Assistant.
  3. (Optional) Enter additional destinations as needed, depending on how you want to partition the data that Sigma writes back to your data warehouse. For example, you might create separate destinations for different teams and set up your team and schema permissions to ensure that each team has access to write to their designated destinations.
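The CATALOG.SCHEMA format and the per-team example above can be illustrated with a short Python sketch. The group names and destinations are invented for illustration, and the validation rule (exactly two non-empty parts separated by a period) is an assumption about the format, not Sigma's own check.

```python
def parse_destination(path: str) -> tuple[str, str]:
    """Split a CATALOG.SCHEMA path into its catalog and schema parts."""
    parts = [part.strip() for part in path.split(".")]
    if len(parts) != 2 or not all(parts):
        raise ValueError(f"Expected CATALOG.SCHEMA, got {path!r}")
    return parts[0], parts[1]


def team_grants(destinations: dict[str, str]) -> list[str]:
    """Build one write-privilege GRANT per team group and its write destination."""
    statements = []
    for group, path in destinations.items():
        catalog, schema = parse_destination(path)
        statements.append(
            f"GRANT MODIFY, CREATE TABLE ON SCHEMA `{catalog}`.`{schema}` TO `{group}`;"
        )
    return statements


# Hypothetical Databricks groups mapped to hypothetical write destinations.
TEAM_DESTINATIONS = {
    "finance-analysts": "sigma_writeback.finance",
    "marketing-analysts": "sigma_writeback.marketing",
}

for statement in team_grants(TEAM_DESTINATIONS):
    print(statement)
```

Keeping the mapping in one place makes it easier to check that every destination entered in Sigma has a matching grant in Databricks.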

Configure connection features

In the Connection features section, specify the following:

  1. In the Connection timeout field, specify the amount of time, in seconds, that Sigma should wait for the query to return results before timing out. The default is 120 seconds. The maximum is 600 seconds (10 minutes).

  2. (Optional) If you do not want Sigma to automatically make column names from the data source more readable, turn off the Use friendly names switch. For example, with Use friendly names turned on, a catalog column ORDER_NUMBER or OrderNumber appears as Order Number.

  3. (Optional) If you want to see and use your hive_metastore catalog in Sigma, turn on the Enable Hive metastore switch. This switch is turned off by default.
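To make the friendly names example concrete, the following Python sketch approximates the kind of transformation described in step 2. It is not Sigma's actual algorithm; the function name and the splitting rules (underscores and lowercase-to-uppercase boundaries) are assumptions chosen only to reproduce the two examples given.

```python
import re


def friendly_name(column: str) -> str:
    """Approximate a readable label from a warehouse column name."""
    # Insert a space at each lowercase-or-digit to uppercase boundary (OrderNumber).
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", column)
    # Treat underscores and whitespace as word separators and drop empty pieces.
    words = [word for word in re.split(r"[_\s]+", spaced) if word]
    # Capitalize the first letter of each word and lowercase the rest.
    return " ".join(word.capitalize() for word in words)


for raw in ("ORDER_NUMBER", "OrderNumber"):
    print(raw, "->", friendly_name(raw))
```

Both inputs print as Order Number, matching the example in step 2.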

Enable Python

To enable Python on your Databricks connection, follow the steps to Set up a Databricks connection for Python, including the Databricks configuration steps.

Finish creating your connection

After you specify all the parameters of the connection, create the connection and grant access to it:

  1. Click Create at the top right of the screen to create your connection. Sigma displays a connection summary on the screen.

  2. Click Browse connection, then click Add permission to grant connection access for users in your organization. See Data access overview.

    [Image: The Permission summary on the connection, showing that no users have access to this connection yet.]
  3. Use the navigation in the left panel to explore the schemas and tables in your connection.

    [Image: The browse connection view, showing a table available through the connection.]

Databricks Partner Connect

Databricks is one of Sigma's partners, so you can also establish a connection quickly through Databricks Partner Connect. See What is Databricks Partner Connect? in the Databricks documentation.