Connect to Databricks
This document provides instructions on how to configure a Databricks connection with Sigma.
Contents
- Requirements
- Limitations
- Configure Databricks
- Configure Sigma
- Databricks Partner Connect
- Related resources
Requirements
- Admin privileges in your Sigma organization; see Account types.
- Admin privileges in Databricks
- We recommend that you avoid granting excessive permissions to the account you use when connecting to your data store; for example, you do not require SYSADMIN-level access.
Limitations
When connecting from Sigma to a Databricks data store, Sigma does not support the following features and interactions:
- Dataset warehouse views
- Connecting to the warehouse through OAuth
- Support for function ArrayDistinct
Configure Databricks
Complete the following steps in Databricks before you add a Databricks connection to Sigma.
- Create a Databricks SQL endpoint if one doesn't already exist; see Databricks documentation on their specific SQL dialect. Databricks SQL endpoints are only available for Databricks accounts on their Premium plan or higher.
- We recommend that you disable Auto stop on your Databricks endpoint. In the Databricks SQL menu, navigate to SQL Warehouses and select the endpoint to use in Sigma.
-
Click Edit in the more ••• menu and turn off Auto stop.
If Auto stop is enabled for your Databricks SQL endpoint, then your first query will be slow or timeout when the SQL endpoint is in a suspended state.
- Go to the SQL Admin Console to grant privileges to Databrick users or groups to add to Sigma.
- For most use cases, SELECT privileges on tables and USAGE privileges on schemas/catalogs are sufficient.
- You need to grant SELECT privileges for each table in a schema, even if the user has USAGE privileges on the schema.
- Tables in the catalog
hive_metastore
requireREAD_METADATA
privileges. - To enable the use of writeback features, such as Materialization, CSV Upload, and other forthcoming write features, you must grant
MODIFY
privileges on a write schema. - Unity Catalog is Databricks’s newest catalog and is currently in public preview. Eventually, it will replace Hive Metastore. If Unity Catalog is enabled, Sigma can access catalogs, schemas, and tables managed by Unity Catalog.
Configure Sigma
Before setting up a connection in Sigma, create an access token, and gather the Server hostname and HTTP path from your Databricks Starter Endpoint:
Follow these basic steps to create a connection:
- Click the user icon at the top right of your screen.
The user icon is usually composed of your initials. - In the drop-down menu, select Add connection.
- The Add new connections page appears.
- In the Connection details, specify these:
- Name
- Specify the Name of the new connection. Sigma displays this name in the connection list.
- Here, we use Databricks connection.
- Type
- Select the tile that represents the data warehouse you use.
- Here, click the Databricks tile.
- In the Connection Credentials section, specify the following:
- Host
- The address of your database.
- This can be a URL, or an IP address.
- HTTP path
- Enter the HTTP path of your Databricks cluster here.
- Access token
- Generate a token in Databricks, using these instructions, and enter its value here.
- For example,
t457e0fo09iv7rwqnkof124084sou9jt
.
- In the Connection Features section, specify the following:
- Connection timeout
- The time before timeout (or cancellation), in seconds, that Sigma waits for the query to return results.
- Default is 120, or 2 minutes.
- Maximum is 600, or 10 minutes.
- Use friendly names
- This switch makes column names from the data source more readable.
- For example, a database column ORDER_NUMBER appears as Order Number.
- On by default.
- In the Write Access section, decide if you require write access.
See Set up write access.
- Enable write access
- Necessary for CSV upload and Materialization.
- Off by default.
- If on, specify the Write schema field.
- Write catalog
- The catalog where Sigma writes tables.
- The first layer of the object hierarchy, used to organize your data assets.
- Appears only if Enable write access is on.
- Write schema
- The schema where Sigma writes tables.
- The second layer of the object hierarchy, and contain tables and views; also known as databases.
- Appears only if Enable write access is on.
- After you specify all the parameters of the connection, click Create.
- After you successfully create your connection, Sigma displays it on the screen.
- To verify your connection, click Browse Connection, and then explore the visible databases and tables.
- Click Add Permission to grant data access for users in your organization.
See Data permissions. - The new connection also appears in the list of connections you have in your account.
Databricks Partner Connect
Databricks is one of Sigma's partners, so you can quickly establish a connection between them through the Databricks Partner Connect interface.
Related resources
- Databricks Connect in Databricks documentation
- Databricks Partner Connect