Prepare a cohort analysis

Cohort analysis involves identifying groups with similar characteristics - cohorts - and analyzing data across the them. Cohort analysis is commonly used to analyze patterns of user behavior over time. 

This  tutorial focuses on cohort analysis and covers:

  • Table manipulations
  • Functions for helping to determine cohorts, including BinFixed
  • Table formatting
  • Bar chart formatting

Contents

Requirements

Part 1 - Update the Base Table

Part 2 - Create the Base Revenue Cohort table

Part 3 - Create the Revenue Cohort table

Part 4 - Visualize cohorts in a bar chart

Related resources

Requirements

  • Your user account must be a Creator or Admin.
  • A workbook with a Data page that includes a table based on the PLUGS sample data. Use the tutorial Get started with a workbook to set this up. 

Part 1 - Update the Base Table

In Part 1 you modify the Base Table by creating columns, truncating dates, performing a group by, and building aggregate calculations. You achieve this through simple UI interactions and spreadsheet style formulas, and without writing a single line of SQL. The table's changes apply to all future workbook elements that use the table, such as visualizations.

It is a Sigma best practice to start your workbook analysis using a base table, and create child elements from the base table. This provides maximum flexibility and control when adding filters, creating columns that are reused, or using parameters that will effect multiple elements. 

When you complete Part 1, you will have a strong foundation with which to further build a cohort analysis in Part 2.

  1. Open the workbook you previously created at Get started with a workbook.
  2. In the Data page, select the Base Table. 
  3. Truncate the [Date] column to Day.
    • Select the [Date] column.
    • From the column's menu, select Truncate date > Day.
      truncate-date-to-day.png
      Sigma truncates the date to show the data, by default in yyyy-mm-dd format.
    • From the column's menu, select Rename column, or press Shift-R. 
    • In the column's header, enter "Purchase Date". 
    • At this point, confirm that you have the new Purchase Date column next to the first Order Number column. It should look like this:
      new-purchase-date.png
  4. Change the format of the [Cost] column to Currency.
    • Scroll right and select the [Cost] column.
    • Change its format to Currency by either:
      • Using the column menu to select Format > Currency.
      • Clicking the toolbar currency icon.
        format-cost-as-currency.png
  5. Change the format of the [Price] column to Currency, following the same process as for [Cost] in Step 2.
  6. Create a new [Revenue] column.
    • Select the [Price] column.  
    • From its column menu select Add new column or press Shift +. 
      price-add-new-column.png
      Sigma adds a new column next to [Price] and names it [Calc].
    • By default the new column is selected, but if it is not, be sure to select [Calc].
    • In the formula bar, enter the following and press Return:  
      [Quantity] * [Price]
      Sigma calculates the formula and populates the column with the results, which are already in currency format. 
    • From the column's menu, select Rename column, or press Shift-R. 
    • In the column's header, enter "Revenue" and press Return. 
    • At this point, you should see the reformatted the [Cost] and [Price] columns and have a new [Revenue] column. It should look like this:
      new-revenue-column.png
  7. Add a Cost of Goods Sold (COGS) column.
    • Select the [Price] column and follow the same process as Step 4, except:
      • Use the following formula:
        [Quantity] * [Cost]
      • Rename the column 'COGS".
  8.  Add a Profit column.
    • Select the [Price] column and follow the same process as Step 4, except:
      • Use the following formula:
        [Revenue] - [COGS]
      • Rename the column "Profit".
  9. At this point, you should see all the new columns. It should look like this:
    new2-plugs-table.png
  10. Create a grouping by customer key.
    • Scroll right and select the [Cust Key] column.
    • From its menu select Group column or press Shift G. 
      Sigma groups the data in the column and moves it all the way to the left of the table. Also, the GROUPINGS section in the left panel shows the new GROUP BY information. This is the starting point for building aggregate calculations. 
      groupby-custkey.png
  11. Build an aggregate calculation for the first (minimum) purchase date.  
    • Select the [Cust Key] column.
    • From its menu select Add new column or press Shift +. 
      TIP: When you add a column next to an existing column, it ensures that the new column is in the same grouping level as the existing column. You want these new aggregations to be calculations in the [Cust Key] grouping. Alternatively, after you create the new column you can locate the column in the left panel and drag it into the CALCULATIONS  section of the [Cust Key] grouping, or click the “Add Calculation” + menu. 
    • For the new column enter the formula:
      Min([Purchase Date])
      Sigma calculates the values and populates a new column named [Min of Purchase Date]. These results show the first purchase date of each customer. 
    • Rename the column "First Purchase Date".
  12. Build an aggregate calculation for customer revenue. 
    • Repeat the process from Step 9, except:
      • Use the formula:
        Sum(Revenue)
        Sigma calculates the values and populates a new column. These results show the sum of revenue for each customer. 
      • Rename the new column "Customer Revenue". 
  13. At this point, you should have the new grouping and calculations. It should look like this:
    first-grouping.png
  14. Collapse all of the columns by clicking the minus sign (-) next to the [Cust Key] title. Sigma changes the minus sign to a plus + sign.
    grouped-table-collapsed.png
    Previously you worked with the [Cust Key] grouping expanded, which enabled you to see the line item level details that feed the aggregate level calculations. In Sigma the underlying records are always  accessible in no more than a few clicks.
  15. (Optional) See query history.
    At this point you have performed a grouping and built two aggregate calculations. Sigma translated these steps into machine-generated SQL, which queried the CDW. The SQL that Sigma generates is the equivalent of the following simplified query:
    Select Cust Key, min(purchase date) as First Purchase Date, sum(revenue) as 
    Customer Revenue from Table group by Cust Key
    To see the actual SQL that Sigma generates, at the top right of the menu bar click the query icon and select Query history
    query-history-menu.png
    In the Queries modal, click an event in the left panel to see the SQLqueries-modal.png
  16. (Optional) See column details.
    • Select the [Customer Revenue] column.
    • From the column's menu select Column Details or press Cmd/Ctrl- I.
      Sigma opens a modal with profile information about the data, including metrics such as row count, distinct count, and null count, as well as statistical metrics. This is a very helpful tool to quickly and efficiently view the data in any given column. Also notice the minimum and maximum values.
      column-details.png
  17. Calculate a bin metric that will group values together based on their distribution into specified ranges.
    • Select the [Cust Key] column.
    • From menu select Add new column or press Shift +.
    • Add the formula:
      BinFixed([Customer Revenue], 300, 1000000, 10)
      The BinFixed formula organizes your data into the number of “bins” you are trying to analyze. The inputs for this formula are:
      [value] (required): The value or field name, in this case the Customer Revenue field, for which the bin is computed.
      min (required): The lower bound. For any value less than this the bin will be 0.
      max (required): The upper bound. For any value greater  than this the bin will be Bins+1.
      bins (required): The number of bins into which to split the value.
      In this example, the min and max are 300 and 1,000,000 respectively and the values are split into 10 bins. You effectively split the customers into deciles which you will leverage later on in the analysis.
      Tip: 300 and 1,000,000 were chosen for simplicity. You could also do a nested formula:
      BinFixed([Customer Revenue], min([Customer Revenue], Revenue] , 10)
      Both methods are valid, with the nested formula being more dynamic.
    • Rename the column "Customer Revenue Bin".
    • Reformat the column as Whole number. 
  18. Review the grouped calculations. Here is what the table and its grouping should look like, collapsed:
    grouping-with-bins.png

Part 2 - Create the Base Revenue Cohort table

Next, use the data from the Base Table to create a new table on a new page.

  1. On the Data page, expand the [Cust Key] column to show all the data by clicking the + icon in the column title. 
  2. At the right of the Base Table, hover over the corner to see the table's menu.
  3. Select Create child element > Table.
    child-table-menu.png
    Sigma creates a new table below the Base Table.
  4. Rename the new table "Base Revenue Cohort".
    • Double-click the table's title and enter the new name.

Part 3 - Create the Revenue Cohort table

  1. From the new Base Revenue Cohort table's menu, select Create child element > Table.
  2. Rename the table "Revenue Cohort".
    Now you have three tables on the Data page.
  3. From the new Revenue Cohort table's menu, select More > Move to > New page.
    Sigma creates a new page called "Page 1" and places the Revenue Cohort table on it. 
  4. From the page menu at the bottom left, rename the new page "Revenue Cohort".
  5. Group the store regions.
    • Scroll right and select the [Store Region] column.
    • From the column menu select Group Column or press Shift G. 
      Sigma moves the [Store Region] column to the left side of the table. By default the regions are expanded.
  6. Add a column for region revenue.
    • From the [Store Region] column's menu, select Add new column or press Shift +. 
    • Add the formula:
      Sum([Revenue])
    • Rename the column "Region Revenue". 
      The new columns should look like this:
      store-region-grouping.png
  7. Group the customer revenue bins.
    • Scroll right and select the [Customer Revenue Bin] column. 
    • From it's column menu select Group Column or press Shift G. 
      Sigma creates the new column in a new grouping.
      cust-rev-bin-grouping.png
  8. Add a new column for bin revenue.
    • Select the [Customer Revenue Bin] column.
    • From its menu select Add new column or press Shift +. 
    • Enter the formula:
      Sum([Revenue])
    • Rename the column "Bin Revenue". 
  9. Add a new column to rank the bin.
    • Select the [Bin Revenue] column.
    • From its menu select Add new column or press Shift +. 
    • Enter the formula:
      Rank([Bin Revenue], "desc")
    • Rename the column "Bin Rank". 
      At this point your table should look like this:
      table-with-bin-rank.png
  10. Add a percentage-of-region column.
    • Select the [Bin Rank] column.
    • From its menu select Add new column or press Shift +. 
    • Enter the formula:
      [Bin Revenue] / [Region Revenue]
    • Rename the column "% of Region". 
    • Format the column as a Percentage. 
  11. Collapse the [Customer Revenue Bin] grouping.
    The table should look like this:
    collapse-cust-rev-bin-grouping.png
  12. (Optional, bonus feature) Add data bars to the [% of Region] column.
    • Select the [% of Region] column.
    • From its menu select Conditional formatting.
      Sigma opens the formatting panel on the left. 
    • Select DATA BARS.
      cond-format-data-bars.png
      Sigma applies the data bars to the column:
      percent-region-databars.png
  13. (Optional) Apply sorting.
    • Use a column's menu to apply sorting. For example, sort [Bin Revenue] descending:
      sort-bin-revenue.png

Part 4 - Visualize cohorts in a bar chart

  1. Go to the Data page.
  2. From the Revenue Cohort Base Table's top right hover menu, select Create child element > Visualization.
    create-child-visualization.png
    Sigma creates a new, default, empty bar chart called "New Bar Chart".
  3. Double-click the title and rename the chart "Revenue by Store Region and Customer Bin". 
  4. From the Revenue Cohort Base Table's top right hover menu, select Move to > Revenue Cohort.
    Sigma moves the table and opens the Revenue Cohort page.
  5. In the left panel, drag and drop the columns to the sections above, as follows:
  6. Move [Store Region] to the X-AXIS section.
  7. Move [Revenue] to the Y-AXIS section. 
  8. Move [Customer Revenue Bin] to the MARKS > COLOR section.
    The table should look something like this:
    rev-by-storeregion-customerbin.png
  9. (Optional) Rearrange the Revenue Cohort page.
    • Select the bar chart and drag-and-drop it next to the table. 
    • Select the bar chart and, in the left panel, change its orientation to horizontal.
      horizontal-barchart.png
    • Add a page title using the Text element.
      • Click ADD NEW ELEMENT > UI ELEMENTS > TEXT.
      • Enter a page title.
        Format the title using the formatting options at the top of the page.
      • Drag and drop the new page title to the top of the page.
      • Your new page could look something like this:
        reformat-with-title.png

Related resources


Was this page helpful?
Yes No