group by, sum, count, median, concatenate, first, last, and spatial summarizations) to selected fields, producing a summarized output table whose columns are defined by each action’s Output Field Name.
When Import encounters an Alteryx Summarize tool, it analyzes the selected actions and converts the Summarize tool into either an Aggregate, a Deduplicate gem, or a Reformat gem.
Automated migration results
When Prophecy imports a Summarize tool, it applies the following logic:- Summarize tools that include grouping and aggregation actions are migrated to an Aggregate gem.
- Summarize tools that include grouping without additional aggregation are migrated to a Deduplicate gem.
- Summarize tools that include non-grouped aggregation or positional actions are typically migrated to a Reformat gem.
Manually replicate in Prophecy
When recreating an Alteryx Summarize step manually in Prophecy, you do not need to mirror the internal structure generated by Import. Instead, choose the Prophecy gem that best matches the intent of the Summarize action. As a general guideline:- Use the Aggregate gem when your goal is to reduce rows using grouping and aggregation (for example, sums, counts, medians, percentiles, or spatial aggregations).
- Use the Deduplicate gem when your goal is to retain one row per unique key without computing additional metrics.
- Use the Reformat gem for non-grouped aggregations or positional summarizations that collapse the dataset to a single row.
Configuration options
Alteryx Summarize tool
In Alteryx, the Summarize tool is configured by selecting one or more actions for individual fields. Each action defines both the operation to perform (for example, group by, sum, count, concatenate, first, median, or spatial and financial summarizations) and the resulting Output Field Name. Multiple actions can be applied to the same field, and actions may be combined across fields. The presence or absence of group by actions determines whether the output is grouped (one row per unique key) or non-grouped (a single summary row). Alteryx also applies implicit semantics for blanks, nulls, ordering, and data types, which influence how summarization results are computed.Prophecy gems
In Prophecy, Summarize behavior is represented using one of several gems depending on the intent of the Alteryx configuration:- Use the Aggregate gem to apply aggregations to grouped rows. You configure Group by columns in the gem, and target aggregations to specific output columns.
- Use the Deduplicate gem to group rows without aggregations. One row is retained per unique key defined by the grouped columns.
- Use the Reformat gem for non-grouped summarizations and positional actions that collapse the dataset to a single row.
Output behavior
Summarize actions typically reduce the number of rows in the dataset. The resulting output contains:- One row per unique group when grouping is specified
- A single row when no grouping is specified
Known caveats
- Explicit expressions. Import preserves Alteryx Output Field Names as Prophecy output columns whenever possible. To match Alteryx semantics, especially around blank and null handling, Import may generate more explicit expressions than a user would typically write by hand (for example, conditional logic around
count, string aggregation, or spatial functions). - Ordering-dependent actions. Summarize actions such as first and last depend on record ordering. In Databricks SQL, ordering is not guaranteed unless explicitly defined, so results may differ from Alteryx when input order is significant.
- Blank vs null semantics. Alteryx distinguishes between blank and null values. Databricks SQL does not make the same distinction by default, so imported expressions may include explicit logic to preserve Alteryx behavior.
- Statistical functions. Some statistical summarizations (such as percentiles, variance, or mode) rely on Databricks SQL’s native implementations. Minor differences in definitions or tie-breaking behavior may occur.
- Financial calculations. Financial Summarize actions (such as
npvorirr) are not supported directly and will require manual adjusting after import.
Examples
An input dataset contains sales records with the following fields:RegionOrderDateOrderAmountOrderID
Example: Grouped aggregation with multiple actions
Goal: Produce a region-level summary table with one row per region that reports total sales, number of orders , and the most recent order date.Alteryx (Summarize tool)
In Alteryx, configure a Summarize tool with these actions:- Group By:
Region - Sum:
OrderAmount(Output Field Name:TotalSales) - Count:
OrderID(Output Field Name:OrderCount) - Max:
OrderDate(Output Field Name:LastOrderDate)
Region, with total sales, number of orders, and the most recent order date.
Prophecy (Aggregate gem)
To reproduce this logic manually in Prophecy:- Add an Aggregate gem to the canvas.
-
Under Group By Columns, select
Region. -
Under Expressions, use the Visual Builder to add the following columns:
- Target column
TotalSalesconfigured as asumfunction with argumentOrderAmount. - Target column
OrderCountconfigured as acountfunction with argumentOrderID. - Target column
LastOrderDateconfigured as amaxfunction with argumentOrderDate. - Target column
Regionconfigured asRegion.
- Target column
Example: Grouping without aggregation
Goal: Reduce the dataset to one row per unique order id.Alteryx (Summarize tool)
Configure the Summarize tool with a single action: Group By:OrderID
Prophecy (Deduplicate gem)
- Add a Deduplicate gem to the canvas.
- Select Distinct Rows.
- Add
OrderIDas a target column, configured asOrderID.
Example: Non-grouped summarization
Goal: Collapse the entire dataset into a single-row summary containing overall totals.Alteryx (Summarize tool)
Configure the Summarize tool to include no Group By actions and:- Sum:
OrderAmount(Output Field Name:TotalSales) - Count:
OrderID(Output Field Name:OrderCount)
Prophecy (Reformat gem)
Use a Reformat gem:- Add a Reformat gem with no grouping.
- Under Expressions, use the Visual Builder to add the following columns:
- Target column
TotalSalesconfigured as asumfunction with argumentOrderAmount. - Target column
OrderCountconfigured as acountfunction with argumentOrderID.
- Target column

