This gem runs in .
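The Imputation gem replaces null or user-specified values in numeric fields. For example, you can replace null sales fields with the average of non-null sales so that missing values do not distort later calculations.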
Prerequisites
Add the prophecy_basics package, version 1.0.11 or higher, to your project.
Parameters
| Parameter | Description |
|---|---|
| Fields to impute | Select one or more numeric fields to update. |
| Incoming value to replace | Choose which value should be replaced in the selected fields. |
| Value to Replace | Enter the value to replace when Incoming value to replace is set to User-specified value. |
| Replace with value | Choose the value used for replacement. |
| Replacement value | Enter the replacement value when Replace with value is set to User-specified value. |
| Include imputed value indicator field | Add an indicator field for each imputed column that shows whether a value was imputed. |
| Output imputed values as a separate field | Keep the original field unchanged and write the imputed result to a new column. (Non-imputed values are also included in the new column.) |
How it works
The Imputation gem scans the selected fields and looks for values that match the configured Incoming value to replace setting. Matching values are replaced using the selected method:
- Average: Replaces with the mean of valid values in the field, excluding the value being replaced.
- Median: Replaces with the middle value in the field, excluding the value being replaced.
- Mode: Replaces with the most frequently occurring value in the field, excluding the value being replaced.
- User-specified value: Replaces with the value you provide.
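
The four methods above can be sketched in plain Python. This is only an illustration of the described behavior, not the gem's actual implementation: the function name `impute` and its parameters `to_replace`, `method`, and `user_value` are hypothetical, and the sketch assumes every value other than the one being replaced is a valid number.

```python
from statistics import mean, median, mode


def impute(values, to_replace=None, method="average", user_value=None):
    """Replace entries equal to `to_replace` using the chosen method.

    Hypothetical helper for illustration; names are not part of the gem.
    """
    # The statistic is computed on valid values only,
    # that is, everything except the value being replaced.
    valid = [v for v in values if v != to_replace]

    if method == "average":
        fill = mean(valid)
    elif method == "median":
        fill = median(valid)
    elif method == "mode":
        # Assumption: ties go to the first value encountered.
        fill = mode(valid)
    elif method == "user":
        fill = user_value
    else:
        raise ValueError(f"unknown method: {method}")

    return [fill if v == to_replace else v for v in values]
```

For instance, `impute([200.0, 150.0, None, 300.0], method="median")` fills the null with 200.0, the median of the three remaining values.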
Output
By default, the output contains the original data stream with imputed values written back into the selected fields.
When Include imputed value indicator field is enabled, an additional field is added for each imputed field to indicate whether the value was imputed.
Naming pattern: <original_field>_Indicator
When Output imputed values as a separate field is enabled, the original field is preserved and a new field is added with the imputed result.
Naming pattern: <original_field>_ImputedValue
If both options are selected, both additional fields are included.
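
To make the effect of the two output options concrete, here is a minimal sketch that applies Average imputation to null values in one field of a list of row dictionaries. The function `impute_field` and its flags are hypothetical names, and encoding the indicator as 1 or 0 is an assumption; the source only says the indicator shows whether a value was imputed.

```python
from statistics import mean


def impute_field(rows, field, add_indicator=False, separate_field=False):
    """Fill nulls in `field` with the average of its non-null values.

    Hypothetical helper for illustration; not the gem's implementation.
    """
    fill = mean([r[field] for r in rows if r[field] is not None])

    out = []
    for r in rows:
        new = dict(r)  # copy so the input rows stay unchanged
        was_null = r[field] is None
        value = fill if was_null else r[field]

        if separate_field:
            # Original field is preserved; result goes to a new field.
            new[f"{field}_ImputedValue"] = value
        else:
            # Default: overwrite the original field.
            new[field] = value

        if add_indicator:
            # Assumption: 1 marks an imputed value, 0 an original one.
            new[f"{field}_Indicator"] = 1 if was_null else 0

        out.append(new)
    return out
```

With both flags enabled, a row whose `Price` is null keeps the null in `Price` and gains `Price_ImputedValue` and `Price_Indicator`.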
Notes
- This gem works for numeric fields.
- Imputation is calculated separately for each selected field.
- When using Average, Median, or Mode, the replacement statistic is calculated using valid values only, excluding the value being replaced.
- If you do not select Output imputed values as a separate field, the original field is overwritten with the imputed result.
Example
Suppose you have the following dataset with missing values in numeric fields.
| Product | Price | Total_Sale |
|---|---|---|
| Shirt | 20.0 | 200.0 |
| Pants | null | 150.0 |
| Jacket | 50.0 | null |
| Shoes | 30.0 | 300.0 |
To fill the gaps, replace null values with the Average in Price and Total_Sale.
Result
| Product | Price | Total_Sale |
|---|---|---|
| Shirt | 20.0 | 200.0 |
| Pants | 33.3 | 150.0 |
| Jacket | 50.0 | 216.7 |
| Shoes | 30.0 | 300.0 |
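
The filled values in the result table are consistent with averaging the non-null entries of each column. The short check below reproduces them; rounding to one decimal place is an assumption made to match how the table displays the numbers, and `average_of_valid` is a hypothetical helper.

```python
from statistics import mean

# Columns from the example dataset; None stands for null.
price = [20.0, None, 50.0, 30.0]
total_sale = [200.0, 150.0, None, 300.0]


def average_of_valid(values):
    """Mean of the non-null values in a column."""
    return mean([v for v in values if v is not None])


filled = {
    "Price": round(average_of_valid(price), 1),            # (20 + 50 + 30) / 3
    "Total_Sale": round(average_of_valid(total_sale), 1),  # (200 + 150 + 300) / 3
}
print(filled)  # {'Price': 33.3, 'Total_Sale': 216.7}
```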

