Follow this quickstart on the Enterprise Edition only.
Objectives
In this quickstart, you will:

- Create a new project and attach a fabric.
- Develop a pipeline with a source and a transformation.
- Run the pipeline interactively.
- Save your changes.
Prerequisites
To complete this quickstart, you need:

- A configured fabric (your execution environment). Either:
  - Create a fabric that connects to Prophecy-managed Databricks. Free trial users automatically have this fabric.
  - Ask a team admin to create a fabric for you that connects to an existing external Spark engine.
Create a project
- Click the Create Entity button in the left navigation bar.
- Hover over the Project tile and select Create.
- Give your project a name.
- Under Team, select your personal team. (It will match your individual user email.)
- Under Select Template, choose Custom.
- For the Project Type, choose Spark/Python (PySpark).
- Click Continue.
- Under Connect Git Account, select Prophecy Managed Git Credentials.
- Click Continue.
- Take a brief look at the default project packages, and then click Complete.
Build a pipeline
Let’s set up the pipeline.

- Ensure the pipeline links to the correct project.
- Create a new development branch called `devQS`. Your pipeline will not appear in the `main` branch until you merge your changes.
- Name your pipeline `weather`.
- Leave the default Batch processing mode.
- Click Create New.
- In the project header, click Attach a cluster.
- Choose an appropriate fabric to connect you to the Spark environment.
- Select an existing cluster, or create a new one. New clusters may take a few minutes to start up.
If you have trouble attaching a cluster, you might not have the right permissions to access or
create a cluster in your external Spark environment.
Add a source
For this quickstart, you’ll create a Seed as the data source.

- Open the Source/Target gem category.
- Click Source. This adds a new Source gem to the canvas.
- Hover over the gem and click Open.
- Select + New Dataset.
- Name the dataset `weather_forecast`.
- In the Type & Format tab, select the Seed type.
- In the Data tab, paste the following data provided in CSV format. Then, click Next.
- Click the Infer Schema button.
- Review the inferred schema. Depending on your Spark engine, you might see different inferred types.
- If the DatePrediction column is assigned a `string` type, change the type to `date`.
- Enable the Enforce specified or inferred schema checkbox to enforce this change downstream.
- Optional: Click on the Copilot icon to generate metadata descriptions of each column.
- Click Next.
- Click Load Data to preview the data in tabular format.
- Click Create Dataset to save your seed as a dataset.
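Conceptually, a Seed is just inline CSV that Spark parses into typed columns. The following pure-Python sketch illustrates the schema steps above: inferred values start out as strings, and enforcing a `date` type on DatePrediction is the equivalent of the manual change described in step 9. The sample rows and the `City` column are hypothetical, since the quickstart's actual CSV data is supplied in the UI.

```python
import csv
import io
from datetime import datetime

# Hypothetical seed content; the quickstart provides the real CSV in the Data tab.
seed_csv = """City,DatePrediction,TemperatureCelsius,WindSpeed
Oslo,2024-05-01,14.5,12.0
Cairo,2024-05-01,33.0,8.5
"""

# csv.DictReader reads every field as a string, which is why DatePrediction
# may be inferred as a string type before you enforce a date type.
rows = list(csv.DictReader(io.StringIO(seed_csv)))

for row in rows:
    # Equivalent of changing DatePrediction from string to date and
    # enforcing the schema downstream.
    row["DatePrediction"] = datetime.strptime(row["DatePrediction"], "%Y-%m-%d").date()
    row["TemperatureCelsius"] = float(row["TemperatureCelsius"])
    row["WindSpeed"] = float(row["WindSpeed"])

print(rows[0]["DatePrediction"])  # 2024-05-01
```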
Add a reformat transformation
Now, you’ll configure your first data transformation using the Reformat gem.

- From the Transform gem category, add a Reformat gem to your canvas.
- Drag the Reformat gem near your Table gem to auto-connect them.
- Open the Reformat gem configuration.
- Notice that the first input port in0 displays your table and its schema.
- Hover over your table name, and click Add 5 columns.
- Change the WindSpeed target column name to `WindSpeedKMH`. This renames the column.
- Add a new target column called `TemperatureFahrenheit`.
- Next to the new target column, write the expression `(((TemperatureCelsius * 9.0D) / 5.0D) + 32)` to convert the temperature into Fahrenheit. If your column name is descriptive, Copilot will write an expression for you.
- After configuring the expression, click Save.
By default, gem expressions expect Spark SQL code.
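To make the expression concrete, here is a minimal Python sketch of the same Celsius-to-Fahrenheit logic the Reformat gem evaluates per row (the sample row values are hypothetical; in the pipeline, Spark SQL applies the expression across the whole dataset):

```python
def to_fahrenheit(temperature_celsius: float) -> float:
    # Mirrors the gem expression (((TemperatureCelsius * 9.0D) / 5.0D) + 32).
    return ((temperature_celsius * 9.0) / 5.0) + 32

# Hypothetical row after the rename and the new target column are applied.
forecast = {"WindSpeedKMH": 12.0, "TemperatureCelsius": 20.0}
forecast["TemperatureFahrenheit"] = to_fahrenheit(forecast["TemperatureCelsius"])
print(forecast["TemperatureFahrenheit"])  # 68.0
```

The `9.0D` / `5.0D` literals in the gem expression force double-precision arithmetic in Spark SQL, which plain Python floats already provide.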
Generate data previews
At this point, you may be curious to know what your data looks like. Generate data previews with the following steps:

- Click the play button in the bottom right corner of the canvas.
- As the pipeline runs, preview icons should appear as gem outputs.
- Click on the Reformat output to preview the data in the Data Explorer.
Save the pipeline
In a real-world situation, your pipeline would be much more complex. Typically, pipelines require multiple transformation steps and send data to external outputs. For the purposes of this tutorial, we will save the pipeline as-is.

- In the project footer, click Commit Changes. This opens the Git workflow dialog.
- Review the commit history on the left side. You should only see the initial project commit at this point.
- Review the entities changed in this commit on the right side. You should see the new weather pipeline and weather_forecast dataset.
- Verify or update the Copilot-generated commit message that describes these changes.
- Click Commit. Your committed changes remain on the `devQS` branch until you merge them into the `main` branch.
What’s next
Continue your Prophecy learning journey:

- Discover the different Spark gems that you can use for data transformation
- Reach out to us if you need additional help or guidance