Overview
The CountRecords gem allows you to count the number of rows in a dataset in different ways. You can count all rows, count non-null values in selected columns, or count distinct non-null values in selected columns.
The gem has a corresponding interactive gem example. See Interactive gem examples to learn how to run sample pipelines for this and other gems.
Prerequisites
- Add the prophecy_basics package, version 1.0.0 or higher, to your project.
The CountRecords gem accepts the following input and output.
| Port | Description |
|---|---|
| in0 | Input dataset with the columns to count. |
| out | Output dataset with the resulting count(s). Output has one row with the selected count(s). |
Parameters
Configure the CountRecords gem using the following parameters.
| Parameter | Description |
|---|---|
| Count option | Choose how the data should be counted. See Count options below. |
| Select columns to count | One or more columns to count. Required for counting non-null records or distinct records. |
Count options
Choose one of the following strategies for counting records.
| Strategy | Description |
|---|---|
| Count number of total records | Returns the total number of rows in the input dataset, including rows that contain null values. |
| Count non-null records in selected column(s) | For each selected column, returns the number of rows with a non-null value in that column. |
| Count distinct records in selected column(s) | Returns the number of distinct, non-null values for each selected column. |
Example
Given a table of patient visits:
| PatientID | VisitDate | Department | Diagnosis |
|---|---|---|---|
| 1 | 2024-01-01 | Cardiology | Flu |
| 2 | 2024-01-02 | Oncology | Cancer |
| 3 | 2024-01-03 | Cardiology | Flu |
| 4 | 2024-01-04 | NULL | Cold |
If you choose:
- Count distinct records on Department: the result will be 2 (Cardiology, Oncology).
- Count non-null records on Department: the result will be 3.
- Count number of total records: the result will be 4.
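To make the three strategies concrete, the following is a minimal plain-Python sketch that reproduces the counts above. It is not the gem's implementation: the function names `count_total`, `count_non_null`, and `count_distinct` are hypothetical, and `None` is assumed to stand in for NULL.

```python
# Illustrative only: these helpers are hypothetical and are not part of
# the CountRecords gem or the prophecy_basics package.

# Rows mirror the patient visits table; None stands in for NULL.
visits = [
    {"PatientID": 1, "VisitDate": "2024-01-01", "Department": "Cardiology", "Diagnosis": "Flu"},
    {"PatientID": 2, "VisitDate": "2024-01-02", "Department": "Oncology", "Diagnosis": "Cancer"},
    {"PatientID": 3, "VisitDate": "2024-01-03", "Department": "Cardiology", "Diagnosis": "Flu"},
    {"PatientID": 4, "VisitDate": "2024-01-04", "Department": None, "Diagnosis": "Cold"},
]


def count_total(rows):
    """Count every row, whether or not it contains null values."""
    return len(rows)


def count_non_null(rows, columns):
    """For each selected column, count rows whose value is not None."""
    return {
        col: sum(1 for row in rows if row[col] is not None)
        for col in columns
    }


def count_distinct(rows, columns):
    """For each selected column, count distinct values, ignoring None."""
    return {
        col: len({row[col] for row in rows if row[col] is not None})
        for col in columns
    }


total = count_total(visits)
non_null = count_non_null(visits, ["Department"])
distinct = count_distinct(visits, ["Department", "Diagnosis"])

print(total)     # 4
print(non_null)  # {'Department': 3}
print(distinct)  # {'Department': 2, 'Diagnosis': 3}
```

Selecting more than one column, as in the last call, yields one count per column. The row with a NULL Department still contributes to the total count and to the Diagnosis counts.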