Pipeline generation turns a short natural-language brief into a draft graph: sources, transforms, joins, quality checks, and destinations connected by edges. Treat the result as a starting point: review credentials, partitioning, and orchestration before promoting it to production.

Describe what you want

In Copilot chat or the dedicated Generate pipeline flow, write what the pipeline should accomplish. Effective briefs include:
  • Sources (systems, buckets, tables, APIs)
  • Transformations (parsing, deduplication, typing, aggregations)
  • Destination and load strategy (truncate, merge, append)
  • Freshness (batch vs near-real-time) if it changes node choice
Example prompt:
Ingest gzipped JSON from s3://analytics-lake/raw/events/, parse nested
`properties`, dedupe on event_id, write daily partitions to Snowflake
table analytics.stg_events, and run a row-count check against source.
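
The transform logic described in that brief can be sketched in plain Python. This is an illustrative sketch only, not the code the assistant generates; the function names are hypothetical.

```python
def dedupe_events(events):
    """Keep the first record seen for each event_id (the brief's dedupe step)."""
    seen = set()
    out = []
    for event in events:
        if event["event_id"] not in seen:
            seen.add(event["event_id"])
            out.append(event)
    return out

def row_count_check(source_rows, loaded_rows):
    """Fail loudly if the load dropped or duplicated rows relative to source."""
    if source_rows != loaded_rows:
        raise ValueError(
            f"row-count mismatch: source={source_rows}, loaded={loaded_rows}"
        )
```

A generated pipeline implements the same idea as nodes rather than functions, but the dedupe key and the two row counts being compared are the parameters you should verify in the draft.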

Generated nodes and edges

The assistant proposes a node set wired in order: extract, flatten, validate, load. It may add:
  • Parser nodes for semi-structured formats
  • Column mapping or type enforcement nodes
  • Data quality rules (null checks, uniqueness)
  • Basic orchestration placeholders (manual trigger until you attach a schedule)
Open each node and confirm parameters match your environment names, not the placeholders Copilot guessed.
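
One way to reason about the proposed wiring is to model it as a plain node-and-edge structure. The dict format below is illustrative, not the tool's internal representation; a reachability helper like this makes it easy to confirm every node actually feeds the load.

```python
# Hypothetical draft graph: extract -> flatten -> validate -> load.
graph = {
    "nodes": ["extract", "flatten", "validate", "load"],
    "edges": [
        ("extract", "flatten"),
        ("flatten", "validate"),
        ("validate", "load"),
    ],
}

def downstream(graph, node):
    """Return every node reachable from `node` by following edges."""
    reached, frontier = set(), [node]
    while frontier:
        current = frontier.pop()
        for src, dst in graph["edges"]:
            if src == current and dst not in reached:
                reached.add(dst)
                frontier.append(dst)
    return reached
```

If a node you expect (say, a quality check) is not downstream of the extract, the draft topology misses a step and the brief should be rephrased.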
1. Generate the draft

Submit your brief and wait for the graph preview. Decline and rephrase if the topology misses a critical step.
2. Bind connections

Replace placeholder connections with real credentials and test connectivity.
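
A quick audit catches any placeholder that slipped through. The `PLACEHOLDER` naming convention below is an assumption for illustration; check what convention your generated nodes actually use.

```python
def unbound_connections(nodes):
    """Return names of nodes whose connection is still a generated placeholder."""
    return [
        name
        for name, config in nodes.items()
        if config.get("connection", "").startswith("PLACEHOLDER")
    ]
```

Running this before the first preview turns a confusing runtime auth failure into an explicit to-do list.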
3. Run preview

Execute a limited preview to validate parsing and schema. Adjust SQL or mappings based on errors.
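
The preview's job is to surface parsing and schema problems on a small sample before a full run. A minimal sketch of that check, with hypothetical names:

```python
def preview(rows, required_columns, limit=100):
    """Validate a small sample against the expected schema.

    Returns a list of (row_index, missing_columns) for rows that fail,
    so fixes can target specific records rather than a full-run failure.
    """
    errors = []
    for i, row in enumerate(rows[:limit]):
        missing = [col for col in required_columns if col not in row]
        if missing:
            errors.append((i, missing))
    return errors
```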
4. Add orchestration

Attach schedules, triggers, or webhooks as needed.
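
Since the draft ships with a manual-trigger placeholder, attaching a schedule is typically a small configuration change. The field names below are illustrative, not the tool's actual schema:

```python
def attach_schedule(pipeline, cron):
    """Swap a manual-trigger placeholder for a cron schedule.

    Guards against a malformed expression before it reaches the scheduler.
    """
    if len(cron.split()) != 5:
        raise ValueError(f"expected a 5-field cron expression, got: {cron!r}")
    pipeline["trigger"] = {"type": "cron", "expression": cron}
    return pipeline
```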

Enhance existing pipelines

Open an existing pipeline and ask Copilot to add or refactor sections:
  • “Insert a null-safe email normalization step before the warehouse load.”
  • “Split the JSON array line_items into a child table load with keys from the parent.”
  • “Add a failure branch that writes bad rows to a quarantine bucket.”
The assistant inserts nodes and reconnects edges where the editor API allows; you verify that edge order and fan-in/fan-out still match runtime semantics.
Generated graphs may omit secrets handling, PII masking, or region constraints. Apply your organization’s standards before merge.
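
The "insert a step before the load" style of refactor amounts to rewiring edges while preserving fan-in. A sketch over a hypothetical node-and-edge dict (not the editor API) shows what to verify after the assistant makes the change:

```python
def insert_before(graph, new_node, target):
    """Insert new_node on every edge feeding target, preserving fan-in."""
    graph["nodes"].append(new_node)
    rewired = []
    for src, dst in graph["edges"]:
        # Redirect each upstream edge into the new node.
        rewired.append((src, new_node) if dst == target else (src, dst))
    # Single edge from the new node into the original target.
    rewired.append((new_node, target))
    graph["edges"] = rewired
    return graph
```

After any assistant edit, the property to check is the same one this function maintains: everything that previously fed the target now flows through the inserted step, and nothing bypasses it.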

Quality checklist

  • Confirm merge keys and incremental cursors so reruns do not duplicate production data.
  • Large cluster defaults may be oversized for dev; downscale compute in Compute settings for tests.
  • Add a short pipeline description for teammates; Copilot text is not a substitute for owned runbooks.
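
The merge-key point is about idempotence: rerunning the same batch must leave the target unchanged. A minimal Python sketch of a key-based upsert (illustrative, with a hypothetical `event_id` key, not the warehouse's MERGE statement):

```python
def merge_upsert(target, incoming, key="event_id"):
    """Key-based upsert: applying the same batch twice yields the same result."""
    by_key = {row[key]: row for row in target}
    for row in incoming:
        by_key[row[key]] = row  # insert new keys, overwrite existing ones
    return list(by_key.values())
```

If the generated load node appends instead of merging on a key, a rerun of the same window doubles the rows; the upsert above is the behavior to confirm in the node's configuration.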