When you migrate an Alteryx workflow to a Prophecy pipeline, the transpiler first reads the uploaded files. It then maps each Alteryx component to a Prophecy-supported gem to create the visual pipeline and open-source code. The following tables list every Alteryx component that Prophecy supports and the Prophecy gems it maps to in a Spark project:

Input and Output

| Alteryx Tool | Equivalent Prophecy Gem | Description |
| --- | --- | --- |
| Browse | Data Explorer | Lets you explore and analyze data samples directly within the UI. |
| Directory | Directory | Returns a listing of all files in a directory with metadata such as creation time, type, and size. |
| Input Data | Source | Facilitates data ingestion from supported sources, including schema inference and preview. |
| Output Data | Target | Allows saving transformed data to various destinations, databases, or file formats. |
| Text Input | Source | Provides a seed source option for manually inputting data to create small datasets. |
| Date Time Now | SchemaTransform | Returns the current timestamp or date at the start of pipeline execution. |

Preparation

| Alteryx Tool | Equivalent Prophecy Gems | Description |
| --- | --- | --- |
| Create Samples | RowDistributor | Samples records on multiple output ports based on given criteria. |
| Data Cleansing | DataCleansing | Standardizes data formats and addresses missing or null values in the data. |
| Filter | Filter | Filters data based on the provided filter condition. |
| Formula | Reformat | Transforms one or more column names or values by using expressions and functions. |
| Generate Rows | SchemaTransform (Spark UDF) | Calls Spark UDF with explode() to create new rows of data. |
| Imputation | Aggregate + Join + Reformat | Replaces specific values with an aggregate of the column such as min, avg, and max using Join and Aggregate Gems. |
| Multi-field Binning | Reformat | Tiles or bins multiple fields by assigning bin numbers using aggregates and expressions. |
| Multi-field Formula | BulkColumnExpressions | Renames, updates, or changes the type of a set of columns. |
| Multi-row Formula | Reformat | Updates rows with sequential transformations inside a Reformat gem. |
| Oversample Field | Filter + SetOperation | Uses Filter and Union operations to oversample fields with appropriate counts. |
| Random % Sample | SampleRows | Generates a random sample of records, optionally grouped on specific fields. |
| Record ID | Sequence | Generates IDs for each record starting from an initial value and incrementing by a fixed amount. |
| Sample | SampleRows | Samples records by selecting a specific number or percentage of records. |
| Select Records | Sequence + Filter | Combines Sequence and Filter Gems to generate record IDs and choose subsets of records. |
| Select | Reformat | Includes, excludes, reorders, casts, or renames columns. |
| Sort | OrderBy | Sorts incoming records based on one or more selected fields. |
| Tile | WindowFunction | Assigns tile values to each record using the ntile() function. |
| Unique | Deduplicate | Filters unique rows based on specified columns. |
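To make the Tile mapping concrete, the following plain-Python sketch mimics how a SQL-style ntile(n) function splits an ordered set of rows into n buckets. It is an illustration of the semantics only, not the code Prophecy generates; the `ntile` helper and the sample `scores` are hypothetical.

```python
def ntile(sorted_rows, n):
    """Assign each row a 1-based tile number, the way SQL ntile(n) does
    over an ordered window: when rows do not divide evenly, the first
    (len(rows) % n) tiles each receive one extra row."""
    base, extra = divmod(len(sorted_rows), n)
    tiles = []
    for tile in range(1, n + 1):
        size = base + (1 if tile <= extra else 0)
        tiles.extend([tile] * size)
    return list(zip(sorted_rows, tiles))


# Hypothetical sample data: seven scores split into three tiles.
scores = sorted([55, 91, 72, 68, 84, 40, 77])
tiled = ntile(scores, 3)
for score, tile in tiled:
    print(score, tile)
```

In a real pipeline the ordering and optional partitioning come from the window specification configured on the WindowFunction gem.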

Join

| Alteryx Tool | Equivalent Prophecy Gems | Description |
| --- | --- | --- |
| Append Fields | Join | When the condition is true, performs Inner/Outer Join. |
| Find Replace | Reformat | Finds and replaces string matches in a column. |
| Join | Join | Joins two data sources based on a common field. |
| Join Multiple | Join | Joins multiple data sources on a common field. |
| Make Group | Script | Identifies relationships across fields using custom logic. |
| Union | SetOperation | Combines records from multiple data sources into one output. |

Parse

| Alteryx Tool | Equivalent Prophecy Gems | Description |
| --- | --- | --- |
| DateTime | Reformat | Uses date functions such as current_date() to manipulate date and time fields. |
| RegEx | Reformat or SchemaTransform | Uses regex functions such as regexp_replace() and regexp_extract() for parsing data. |
| Text to Columns | Script | Implements custom parsing logic to split text into multiple columns. |
| XML Parse | ColumnParser | Parses input JSON or XML fields with schema inference or predefined schema. |
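As a rough illustration of what the regex functions named in the RegEx row do, here is a plain-Python sketch built on the standard `re` module. The helper names mirror the Spark function names for readability, but these are hypothetical stand-ins, not the Spark implementations; in particular, Spark uses Java regex syntax and `$1`-style group references in replacements, whereas Python uses `\1`.

```python
import re


def regexp_replace(text, pattern, replacement):
    """Replace every match of pattern in text (stand-in for the Spark function)."""
    return re.sub(pattern, replacement, text)


def regexp_extract(text, pattern, group_index):
    """Return the requested capture group of the first match,
    or an empty string when nothing matches."""
    match = re.search(pattern, text)
    if match is None:
        return ""
    return match.group(group_index) or ""


# Hypothetical sample values.
print(regexp_replace("2024-01-15", r"-", "/"))
print(regexp_extract("order-1234-US", r"(\d+)-([A-Z]+)", 2))
```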

Transform

| Alteryx Tool | Equivalent Prophecy Gems | Description |
| --- | --- | --- |
| Arrange | Reformat | Rearranges columns in the dataset. |
| Count Records | Reformat | Uses the count() function to count records in a dataset. |
| Cross Tab | Aggregate | Pivots data on unique values of a specified column. |
| Make Columns | Reformat + WindowFunction | Wraps rows of data into columns. |
| Running Total | WindowFunction | Calculates a cumulative sum on a numeric field using a window frame that starts at unbounded preceding. |
| Summarize | Aggregate | Performs functions and calculations on data (excluding domain-specific actions). |
| Transpose | Script | Uses pivoting logic to change data orientation. |
| Weighted Average | Aggregate | Computes a weighted average using required expressions. |
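The Running Total row relies on a window frame that spans from the first row of a partition up to the current row. The plain-Python sketch below shows that behavior under the assumption of one partition column and one ordering column; the `running_total` helper, the column names, and the sample rows are hypothetical, and the code is not what the WindowFunction gem emits.

```python
from collections import defaultdict


def running_total(rows, partition_key, order_key, value_key):
    """Cumulative sum per partition, ordered by order_key. Each output row
    sums every row from the start of its partition through itself."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row)

    result = []
    for partition_rows in partitions.values():
        total = 0
        for row in sorted(partition_rows, key=lambda r: r[order_key]):
            total += row[value_key]
            result.append({**row, "running_total": total})
    return result


# Hypothetical sample data.
sales = [
    {"region": "east", "day": 1, "amount": 10},
    {"region": "east", "day": 2, "amount": 5},
    {"region": "west", "day": 1, "amount": 7},
    {"region": "east", "day": 3, "amount": 20},
]
totals = running_total(sales, "region", "day", "amount")
for row in totals:
    print(row)
```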

Lab

| Alteryx Tool | Equivalent Prophecy Gems | Description |
| --- | --- | --- |
| JSON Build | Script | Builds JSON using functions from the ProphecyLibs package. |
| Transpose in-DB | Aggregate | Uses pivoting logic to transform data orientation within the database. |

Developer

| Alteryx Tool | Equivalent Prophecy Gems | Description |
| --- | --- | --- |
| Base64 Encoder | Reformat | Encodes a string field as Base64 using the base64() function. |
| Download | RestAPIEnrich | Enriches a DataFrame by adding columns from REST API output. |
| Dynamic Input | TableIterator | Iterates over Gems for each row of the input DataFrame. |
| Dynamic Rename | BulkColumnRename | Renames multiple columns systematically. |
| Dynamic Replace | DynamicReplace | Dynamically replaces field values based on user-defined conditions. |
| Dynamic Select | DynamicSelect | Dynamically filters columns based on conditions. |
| JSON Parse | ColumnParser | Parses JSON or XML into a tabular format. |
| Python | Script | Lets you write custom Python code for transformations. |
| Test | Script | Compares data using custom test logic. |
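For the Base64 Encoder row, this short plain-Python sketch shows the kind of value a Base64 encoding expression produces for a string field. It uses Python's standard `base64` module rather than Spark, assumes UTF-8 text, and the `encode_field` helper is hypothetical.

```python
import base64


def encode_field(value: str) -> str:
    """Encode a string as Base64 text (UTF-8 bytes in, ASCII text out)."""
    return base64.b64encode(value.encode("utf-8")).decode("ascii")


encoded = encode_field("hello")
print(encoded)

# Decoding recovers the original string.
print(base64.b64decode(encoded).decode("utf-8"))
```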

Macros

| Alteryx Macro | Equivalent Prophecy Gems | Description |
| --- | --- | --- |
| Standard | Subgraph | Encapsulates multiple Gems within a single, reusable parent Gem for modular pipelines. |
| Batch | TableIterator | Iterates over a subgraph for each input row. |
| Iterative | WhileIterator | Recursively processes rows until the result set is empty or a maximum number of iterations is reached. |
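The stopping rule described for the Iterative macro (stop when the result set is empty or when the iteration cap is hit) can be sketched in plain Python as follows. This is only a model of the control flow under those two assumptions; `while_iterate`, `halve_positive`, and the sample input are hypothetical and do not reflect the WhileIterator gem's actual interface.

```python
def while_iterate(rows, step, max_iterations):
    """Apply step repeatedly to its own output. Stop when the output is
    empty or after max_iterations passes, whichever comes first."""
    collected = []
    current = list(rows)
    iterations = 0
    while current and iterations < max_iterations:
        current = step(current)
        collected.extend(current)
        iterations += 1
    return collected, iterations


def halve_positive(rows):
    # Hypothetical loop body: halve each value and drop anything below 1.
    return [value // 2 for value in rows if value // 2 >= 1]


collected, iterations = while_iterate([8, 3], halve_positive, max_iterations=10)
print(collected, iterations)
```

The iteration cap guards against loop bodies that never produce an empty result.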
If you don’t see a particular component on this list, you can go beyond custom scripting and add custom functionality of your own. To learn more, see Gem Builder for Spark.