Summarize
Summarize groups by dimensions and computes aggregate functions per group. Common functions:- sum, avg, min, max
- count, count distinct
- stddev / variance (when supported)
- arbitrary / first / last value picks with ordering (product-dependent)
- Group keys: Columns that define the grain (
date,region,product_id). - Measures: One entry per output metric (
revenue = sum(line_total)). - Filters: Optional predicate applied before aggregation (HAVING-equivalent may appear separately).
- Group by
order_date,region sum(order_total)asrevenuecount_distinct(order_id)asorders
Pivot
Pivot spreads values from a name column across new columns—turning long tidy data into wide report shapes. Configuration:- Index / row keys: Identifiers that stay as rows (
product_id). - Pivot column: Field whose distinct values become headers (
quarter). - Values column: Numeric or text field to place in cells (
revenue). - Aggregation: How to combine multiple rows that map to the same cell (
sum,max).
(sku, month, units_sold); pivot month to columns 2024-01, 2024-02, … for a spreadsheet-style export.
Unpivot
Unpivot (melt) does the inverse: take many measure columns and produce name / value pairs. Configuration:- Id columns: Keys to retain per row (
store_id,date). - Measure columns: The wide metrics to stack (
sales_q1,sales_q2, …). - Output names: Typically
metric_nameandmetric_value.
Choosing summarize vs pivot first
Long data for analytics
Long data for analytics
Keep fact tables long (entity, metric name, value) when many metrics share dimensions; aggregate with Summarize after filters.
Wide data for exports
Wide data for exports
Pivot late in the graph—right before a Write to spreadsheets or systems expecting fixed columns.
Repairing wide sources
Repairing wide sources
Unpivot early to normalize messy extracts, then join to dimension tables on
metric_name.Related topics
Column transforms
Use window functions when you need row-level outputs with running metrics.
Destinations
Load aggregates to warehouses or marts.