Hash Repartitioning
Repartitions the data evenly across partitions based on the hash value of the specified key.

Parameters
| Parameter | Description | Required |
|---|---|---|
| DataFrame | Input DataFrame | True |
| Overwrite default partitions | Flag to overwrite default partitions | False |
| Number of partitions | Integer value specifying number of partitions | False |
| Repartition expression(s) | List of expressions to repartition by | True |
Compiled code
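As a minimal sketch, assuming PySpark as the target, a hash repartition step could look roughly like the following. The helper name `hash_repartition`, the input name `in0`, and the column names are hypothetical, and the way the "Overwrite default partitions" flag maps onto the optional `num_partitions` argument is an assumption. The only Spark call used is `DataFrame.repartition`, which hash-partitions when it is given columns.

```python
from typing import TYPE_CHECKING, Optional, Sequence

if TYPE_CHECKING:  # imported for type hints only
    from pyspark.sql import DataFrame


def hash_repartition(
    in0: "DataFrame",
    keys: Sequence[str],
    num_partitions: Optional[int] = None,
) -> "DataFrame":
    """Hash-partition in0 by the given column names (hypothetical helper)."""
    if num_partitions is None:
        # Assumed to match "Overwrite default partitions" being off:
        # Spark falls back to its default shuffle partition count.
        return in0.repartition(*keys)
    # Explicit partition count plus the key columns to hash on.
    return in0.repartition(num_partitions, *keys)
```

For example, `hash_repartition(orders_df, ["customer_id"], num_partitions=8)` would place rows with the same `customer_id` in the same one of 8 partitions, where `orders_df` is a placeholder DataFrame.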
Random Repartitioning
Repartitions the data into the given number of partitions without a defined data distribution.

Parameters
| Parameter | Description | Required |
|---|---|---|
| DataFrame | Input DataFrame | True |
| Number of partitions | Integer value specifying number of partitions | True |
Compiled code
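A minimal sketch of this mode, again assuming PySpark as the target: calling `DataFrame.repartition` with only a partition count redistributes rows without any partitioning key. The helper name `random_repartition` and the input name `in0` are hypothetical.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # imported for type hints only
    from pyspark.sql import DataFrame


def random_repartition(in0: "DataFrame", num_partitions: int) -> "DataFrame":
    """Redistribute in0 into num_partitions partitions with no key (hypothetical helper)."""
    # No columns are passed, so rows are not grouped by any value.
    return in0.repartition(num_partitions)
```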
Range Repartitioning
Repartitions the data so that rows whose keys fall within the same range are placed on the same worker.

Parameters
| Parameter | Description | Required |
|---|---|---|
| DataFrame | Input DataFrame | True |
| Overwrite default partitions | Flag to overwrite default partitions | False |
| Number of partitions | Integer value specifying number of partitions | False |
| Repartition expression(s) with sorting | List of expressions to repartition by with corresponding sorting order | True |
Compiled code
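A minimal sketch of a range repartition step, assuming PySpark as the target and using `DataFrame.repartitionByRange`. The helper name `range_repartition`, the input name `in0`, and the column names are hypothetical. Plain column names sort in ascending order; a descending order would instead be passed as a column expression such as `col("order_date").desc()`, which this sketch leaves out to stay free of a Spark import.

```python
from typing import TYPE_CHECKING, Optional, Sequence

if TYPE_CHECKING:  # imported for type hints only
    from pyspark.sql import DataFrame


def range_repartition(
    in0: "DataFrame",
    keys: Sequence[str],
    num_partitions: Optional[int] = None,
) -> "DataFrame":
    """Range-partition in0 by the given column names, ascending (hypothetical helper)."""
    if num_partitions is None:
        # Assumed to match "Overwrite default partitions" being off:
        # Spark falls back to its default shuffle partition count.
        return in0.repartitionByRange(*keys)
    # Rows are split into contiguous key ranges, one range per partition.
    return in0.repartitionByRange(num_partitions, *keys)
```

For example, `range_repartition(orders_df, ["order_date"], num_partitions=4)` would split a placeholder `orders_df` into 4 partitions, each covering a contiguous span of `order_date` values.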
Coalesce
Reduces the number of partitions without shuffling the dataset.

Parameters
| Parameter | Description | Required |
|---|---|---|
| DataFrame | Input DataFrame | True |
| Number of partitions | Integer value specifying number of partitions | True |
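A minimal sketch of the coalesce mode, assuming PySpark as the target and using `DataFrame.coalesce`. The helper name `coalesce_partitions` and the input name `in0` are hypothetical.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # imported for type hints only
    from pyspark.sql import DataFrame


def coalesce_partitions(in0: "DataFrame", num_partitions: int) -> "DataFrame":
    """Merge existing partitions of in0 down to num_partitions (hypothetical helper)."""
    # coalesce only merges partitions; it does not trigger a full shuffle.
    return in0.coalesce(num_partitions)
```

Because coalesce only merges existing partitions, asking for more partitions than the DataFrame already has leaves the partition count unchanged.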

