External storage integration overview

An external storage integration uses a customer-owned bucket¹ in Amazon S3, Google Cloud Storage (GCS), or Azure Blob Storage to support the following features:

  • CSV upload
  • File upload column in input tables
  • Export to cloud storage

Some Sigma features (like file upload columns) have storage flows that require a customer-owned bucket and cannot be used unless an external storage integration is configured. Other features (like CSV upload) have default storage flows that use a Sigma-owned bucket, but you can configure them to use a customer-owned bucket instead.

This document explains the general and feature-specific advantages of an external storage integration using a customer-owned bucket. For information about configuring a storage integration using a specific cloud provider, see the following documentation:

¹ This document uses "bucket" as a generic term for the top-level storage unit in all cloud storage providers. While Amazon S3 and GCS officially use the term "bucket", Azure Blob Storage uses the term "container" to describe the same concept. The storage integration configuration documentation for each cloud provider uses the terminology specific to that provider.

General advantages of using a customer-owned storage bucket

When a feature uses a Sigma-owned bucket to stage, cache, and store files, you cannot see, manage, or access the bucket. These restrictions can conflict with your company's security and compliance requirements. When you choose to use a customer-owned bucket, however, your company gains full control over the following:

  • Data location (where files live)
  • IAM and RBAC policies (who has file access)
  • TTL and lifecycle rules (how long files are retained)
  • Encryption configuration and keys (how files are encrypted)

This level of control can be necessary if your company has strict compliance requirements, needs to maintain full control over its data, or wants to customize its storage experience to align with its existing infrastructure and security policies.
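As an illustration of the retention control listed above, the following is a minimal sketch of a lifecycle rule for a customer-owned Amazon S3 bucket. It builds a lifecycle configuration in the shape that S3 accepts and expires objects under a staging prefix after one day, which roughly mirrors the 24-hour TTL of the Sigma-owned default. The prefix `sigma-staging/` and the helper name `staging_lifecycle_rule` are hypothetical; Sigma's documentation does not prescribe a prefix or a TTL, so choose values that match your own retention policy.

```python
import json


def staging_lifecycle_rule(prefix: str, ttl_days: int) -> dict:
    """Build an S3 lifecycle configuration that expires objects under `prefix`.

    Hypothetical helper for illustration only; it is not part of Sigma or AWS tooling.
    """
    if ttl_days < 1:
        raise ValueError("ttl_days must be at least 1")
    rule_id = "expire-" + prefix.strip("/").replace("/", "-")
    return {
        "Rules": [
            {
                "ID": rule_id,
                "Filter": {"Prefix": prefix},     # apply only to staged files
                "Status": "Enabled",
                "Expiration": {"Days": ttl_days}, # delete objects after N days
            }
        ]
    }


# Assumed prefix and TTL; adjust both to your company's requirements.
config = staging_lifecycle_rule("sigma-staging/", ttl_days=1)
print(json.dumps(config, indent=2))
```

A dictionary in this shape can be passed as the `LifecycleConfiguration` argument of boto3's `put_bucket_lifecycle_configuration` call. GCS and Azure Blob Storage offer comparable lifecycle management features with their own configuration formats.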

Feature-specific advantages of using a customer-owned bucket

There are also feature-specific advantages to using a customer-owned bucket. The following table compares each supported feature's default storage flow (without a storage integration) to the customer-owned bucket storage flow (with a storage integration).

| Feature | Without storage integration | With storage integration |
|---|---|---|
| CSV upload² | Staging files are temporarily stored in a Sigma-owned bucket before loading to your data platform. Sigma controls the bucket region and lifecycle (24-hour TTL). | Staging files are temporarily stored in the customer-owned bucket before loading to your data platform. Your company controls the bucket region and TTL. This helps your organization meet security and compliance requirements that could otherwise block the use of CSV upload. |
| File upload column | File upload columns cannot be used. | Long-lived files that can contain sensitive information are stored in the customer-owned bucket, which offers control over data management, security, and compliance. |
| Export to cloud storage² | Your data platform builds the export file and writes it to a customer-owned bucket using its own storage integration. This data flow can introduce platform-specific formatting and other inconsistencies in comparison to what users see in Sigma. | Sigma can build the export file and write it directly to the customer-owned bucket. This data flow applies platform-agnostic export logic that results in a cleaner and more consistent export format that aligns with what users see in Sigma. |

² CSV uploads and exports require additional configuration to use the storage integration. For more information, see Configure CSV upload and storage options.
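The comparison above can also be read as a simple lookup from a feature and an integration state to a storage flow. The sketch below encodes that lookup for illustration only. The names `Flow`, `FLOWS`, and `storage_flow`, and the feature keys, are hypothetical and do not correspond to any Sigma API. The sketch also assumes that the additional configuration required for CSV uploads and exports is already in place.

```python
from typing import NamedTuple, Optional


class Flow(NamedTuple):
    available: bool                # can the feature be used at all?
    bucket_owner: Optional[str]    # "sigma" or "customer"; None if unavailable
    file_built_by: Optional[str]   # who builds the file; None where not stated


# Keys: (feature, has_storage_integration). Values summarize the comparison table.
FLOWS = {
    ("csv_upload", False): Flow(True, "sigma", None),
    ("csv_upload", True): Flow(True, "customer", None),
    ("file_upload_column", False): Flow(False, None, None),
    ("file_upload_column", True): Flow(True, "customer", None),
    ("export_to_cloud_storage", False): Flow(True, "customer", "data platform"),
    ("export_to_cloud_storage", True): Flow(True, "customer", "sigma"),
}


def storage_flow(feature: str, has_integration: bool) -> Flow:
    """Return the storage flow for a feature, or raise for an unknown feature."""
    try:
        return FLOWS[(feature, has_integration)]
    except KeyError:
        raise ValueError(f"unknown feature: {feature!r}") from None


for feature in ("csv_upload", "file_upload_column", "export_to_cloud_storage"):
    for has_integration in (False, True):
        print(feature, has_integration, storage_flow(feature, has_integration))
```

Note that exports land in a customer-owned bucket in both cases. What changes with a storage integration is which system builds and writes the file.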

Storage integration configuration requirements

To configure a storage integration using a customer-owned bucket, see the documentation about your cloud provider: