Dependencies:
  • ProphecySparkBasicsPython 0.0.1+
  • ProphecySparkBasicsScala 0.0.1+
Cluster requirements:
  • UC dedicated clusters 14.3+ supported
  • UC standard clusters 14.3+ supported
  • Livy clusters 3.0.1+ supported
Filters a DataFrame based on the provided filter condition.

Parameters

| Parameter | Description | Required |
| --- | --- | --- |
| DataFrame | Input DataFrame on which the filter condition will be applied. | True |
| Filter Condition | BooleanType column or boolean expression. Supports SQL, Python and Scala expressions. | True |
Use the visual language syntax to call configuration variables in the Filter gem.

Example

In this example, the Filter gem is used to return only marketing orders that are either finished or approved, while excluding any orders that have been discounted.

[Figure: Example usage of Filter]
The Filter gem configuration translates into the Spark code shown below, which applies the same filtering logic.

Spark code

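from pyspark.sql import DataFrame, SparkSession
from pyspark.sql.functions import col, lit

def Filter_Orders(spark: SparkSession, in0: DataFrame) -> DataFrame:
    # Keep Marketing orders that are Finished or Approved and not discounted.
    return in0.filter(
        (
            (col("order_category") == lit("Marketing"))
            & ((col("order_status") == lit("Finished")) | (col("order_status") == lit("Approved")))
        )
        & ~col("is_discounted")
    )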
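
To check which rows the condition above would keep, the same boolean logic can be mirrored in plain Python without a Spark cluster. This is only a sketch: the sample rows and the `keeps_order` helper are hypothetical, and it assumes every column has a value, so Spark's handling of nulls is not modeled.

```python
# Hypothetical sample rows; column names follow the example above.
orders = [
    {"order_id": 1, "order_category": "Marketing", "order_status": "Finished", "is_discounted": False},
    {"order_id": 2, "order_category": "Marketing", "order_status": "Approved", "is_discounted": True},
    {"order_id": 3, "order_category": "Marketing", "order_status": "Pending", "is_discounted": False},
    {"order_id": 4, "order_category": "Sales", "order_status": "Finished", "is_discounted": False},
]


def keeps_order(row: dict) -> bool:
    """Plain-Python mirror of the filter condition (not Spark code)."""
    return (
        row["order_category"] == "Marketing"
        and row["order_status"] in ("Finished", "Approved")
        and not row["is_discounted"]
    )


# Ids of the rows that pass all three parts of the condition.
kept_ids = [row["order_id"] for row in orders if keeps_order(row)]
```

With these sample rows, order 2 is dropped because it is discounted, order 3 because of its status, and order 4 because of its category.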