Available for Enterprise Edition only.
- Provides sample customer rows and order rows as input
- Defines the expected joined output rows
- Verifies that the join condition works correctly
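As a rough illustration of what such a test checks, the plain-Python sketch below joins sample customer and order rows and compares the result with the expected rows. It is not Prophecy's implementation or generated code; the `inner_join` helper, the column names, and the sample values are all hypothetical.

```python
# Hypothetical sample input rows (column names and values are made up).
customers = [
    {"customer_id": 1, "first_name": "Ada"},
    {"customer_id": 2, "first_name": "Grace"},
]
orders = [
    {"order_id": 10, "customer_id": 1, "amount": 25.0},
    {"order_id": 11, "customer_id": 3, "amount": 40.0},  # no matching customer
]

# Expected joined output rows for an inner join on customer_id.
expected = [
    {"customer_id": 1, "first_name": "Ada", "order_id": 10, "amount": 25.0},
]


def inner_join(left, right, key):
    """Stand-in for the join gem: inner-join two lists of dict rows on `key`."""
    joined = []
    for l_row in left:
        for r_row in right:
            if l_row[key] == r_row[key]:
                joined.append({**l_row, **r_row})
    return joined


actual = inner_join(customers, orders, "customer_id")

# The test passes only if the join produced exactly the expected rows.
assert actual == expected
```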
Prerequisites
Unit tests are available only for Simplified PySpark projects.
Types of unit tests
You can configure two types of unit tests on gems.
- Output rows equality: Compares the actual output rows against a saved snapshot of expected data.
- Output predicates: Evaluates Spark expressions against output data to verify business rules and constraints.
Output rows equality
Output rows equality tests compare the actual output rows against expected data you define. Use this test type when you need to verify that transformations produce identical results.
- Open the gem you want to test.
- Click Unit Tests in the gem configuration.
- Click Create Test to add a new unit test.
- In the Settings section, select Output rows equality from the dropdown.
- Click one or more columns in the left panel to add them to the Selected Columns table.
- Click Create.
- Define expected input data:
  - Select an input port tab, such as in0 or in1.
  - Select the correct data type for each column you are testing.
  - Click + Add Row to add expected input rows.
  - Enter values for each column.
- Define expected output data:
  - Select the out port.
  - Select the correct data type for each column you are testing.
  - Click + Add Row to add expected output rows.
  - Enter values for each column.
- Click Done to save the unit test.
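The comparison this test type performs can be pictured with the following plain-Python sketch. It is an illustration only, not how Prophecy runs the test: the `rows_equal` helper is hypothetical, and ignoring row order is an assumption the source does not confirm. Only the selected columns take part in the comparison.

```python
from collections import Counter


def rows_equal(actual, expected, selected_columns):
    """Compare two lists of dict rows on the selected columns only.

    Assumption: row order is ignored, but duplicate rows still count.
    """
    def project(rows):
        # Keep only the selected columns, as hashable tuples.
        return Counter(
            tuple(row[col] for col in selected_columns) for row in rows
        )

    return project(actual) == project(expected)


# Hypothetical output of a gem and the expected rows entered in the test.
actual = [
    {"customer_id": 2, "amount": 40.0, "loaded_at": "2024-05-01"},
    {"customer_id": 1, "amount": 25.0, "loaded_at": "2024-05-01"},
]
expected = [
    {"customer_id": 1, "amount": 25.0},
    {"customer_id": 2, "amount": 40.0},
]

# loaded_at is not a selected column, so it does not affect the result.
print(rows_equal(actual, expected, ["customer_id", "amount"]))
```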
Output predicates
Output predicates let you define expressions that must evaluate to true for the test to pass. Use predicates when you need to validate business rules, data constraints, or complex conditions rather than exact row matches.
- Open the gem you want to test.
- Click Unit Tests in the gem configuration.
- Click Create Test to add a new unit test.
- In the Settings section, select Output predicates from the dropdown.
- Click one or more columns in the left panel to add them to the Selected Columns table.
- Click Create.
- Define expected input data:
  - Select an input port tab, such as in0 or in1.
  - Select the correct data type for each column you are testing.
  - Click + Add Row to add expected input rows.
  - Enter values for each column.
- Add predicates for the output:
  - In the predicates table, enter a Predicate Name in the first column. Use descriptive names that indicate what the predicate validates.
  - Enter an expression in the Expression column. The expression must evaluate to a boolean value and return true for the test to pass.
  - Click in an empty row below to add additional predicates if needed.
- Click Done to save the unit test.
Example predicates
The following examples show how to write predicates.
| Predicate Name | Expression |
|---|---|
| Amount is positive | amount > 0 |
| First name differs from last name | first_name != last_name |
| Order date in valid range | order_date >= '2024-01-01' AND order_date <= '2024-12-31' |
Each predicate must evaluate to true for the test to pass.
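To make the pass/fail rule concrete, here is a plain-Python sketch that checks the three example predicates against a few output rows. In Prophecy the expressions are Spark expressions; the Python lambdas, the `failed_predicates` helper, and the sample rows below are hypothetical stand-ins, and evaluating each predicate against every row is an assumption.

```python
from datetime import date

# Python equivalents of the example predicates from the table above.
predicates = {
    "Amount is positive": lambda r: r["amount"] > 0,
    "First name differs from last name": lambda r: r["first_name"] != r["last_name"],
    "Order date in valid range": lambda r: date(2024, 1, 1) <= r["order_date"] <= date(2024, 12, 31),
}


def failed_predicates(rows):
    """Return the names of predicates that are false for at least one row."""
    return [
        name
        for name, check in predicates.items()
        if not all(check(row) for row in rows)
    ]


# Hypothetical output rows.
good_rows = [
    {"first_name": "Ada", "last_name": "Lovelace", "amount": 25.0, "order_date": date(2024, 3, 14)},
]
bad_rows = good_rows + [
    {"first_name": "Grace", "last_name": "Hopper", "amount": -5.0, "order_date": date(2025, 1, 2)},
]

print(failed_predicates(good_rows))  # empty list: the test would pass
print(failed_predicates(bad_rows))   # names of the predicates that failed
```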
Generate sample data automatically
Enable automatic data generation to create test input data without manually entering rows. This option generates sample rows from upstream data.
- In the unit test configuration, toggle on Generate Data.
- Enter the number of rows to generate in the Rows field. Prophecy samples this many rows from the input data.
- Click Create to generate the sample input data.
- Review the generated sample input data. Edit the data if needed.
- Click Done to save the unit test.
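Conceptually, this step amounts to taking a fixed number of rows from upstream data. The sketch below shows that idea in plain Python; the source does not say how Prophecy picks the rows, so the random sampling, the seed, and the `sample_rows` helper are all assumptions.

```python
import random


def sample_rows(upstream_rows, n, seed=0):
    """Pick up to n rows from upstream data (hypothetical helper).

    A fixed seed keeps the sample reproducible between runs.
    """
    rng = random.Random(seed)
    n = min(n, len(upstream_rows))  # never ask for more rows than exist
    return rng.sample(upstream_rows, n)


# Hypothetical upstream rows.
upstream = [{"order_id": i, "amount": 10.0 * i} for i in range(1, 101)]

sample = sample_rows(upstream, 5)
print(len(sample))
```

As with manually entered rows, the sampled rows are only a starting point and can be edited before the test is saved.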

